Google’s DiffusionGemma AI Hits 1,000 Tokens Per Second—And It’s Free – Decrypt

Jun 11, 2026 - 00:00

In brief: Google released DiffusionGemma, a free open-weight model that generates entire 256-token blocks simultaneously via text diffusion, reaching over 1,000 tokens per second on an NVIDIA H100, four times faster than standard autoregressive models. The custom drafter module DiffusionGemma needs for local inference doesn’t exist in any public runtime yet: not in mlx-lm, not in LM...
