China’s Xiaomi MiMo Is Now 15X Faster Than ChatGPT and Claude – Decrypt

Jun 8, 2026 - 23:00
In brief: Xiaomi and inference partner TileRT have broken 1,000 tokens per second on a 1-trillion-parameter model, a first at that scale, using a standard 8-GPU commodity node, not custom chips. The speed comes from FP4 quantization on the model’s expert layers and DFlash speculative decoding, which proposes a full block of tokens in one pass...
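For readers curious how block-style speculative decoding can cut the number of expensive model passes, here is a minimal toy sketch in Python. It is not DFlash or TileRT's implementation, whose details are not given here. The names `target_next`, `draft_block`, and `verify_block` are hypothetical, the "models" are simple arithmetic rules, and verification is simulated with a loop where a real system would score the whole block in one batched pass of the large model.

```python
from typing import List, Tuple

Token = int
VOCAB = 50  # assumed toy vocabulary size


def target_next(context: List[Token]) -> Token:
    """Toy stand-in for the large model: a deterministic next-token rule."""
    return (sum(context) * 31 + len(context)) % VOCAB


def draft_block(context: List[Token], block_size: int) -> List[Token]:
    """Toy drafter: proposes a whole block, wrong whenever len(ctx) % 5 == 0."""
    ctx = list(context)
    block: List[Token] = []
    for _ in range(block_size):
        tok = target_next(ctx)
        if len(ctx) % 5 == 0:
            tok = (tok + 1) % VOCAB  # deliberate disagreement with the target
        block.append(tok)
        ctx.append(tok)
    return block


def verify_block(context: List[Token], block: List[Token]) -> List[Token]:
    """Keep the longest prefix the target agrees with, plus one target token."""
    ctx = list(context)
    committed: List[Token] = []
    for tok in block:
        expected = target_next(ctx)
        if tok != expected:
            committed.append(expected)  # replace the first wrong token and stop
            return committed
        committed.append(tok)
        ctx.append(tok)
    committed.append(target_next(ctx))  # whole block accepted: one bonus token
    return committed


def speculative_decode(
    prompt: List[Token], num_tokens: int, block_size: int = 4
) -> Tuple[List[Token], int]:
    """Generate num_tokens tokens; also return the number of verify passes."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < num_tokens:
        block = draft_block(out, block_size)
        out.extend(verify_block(out, block))
        passes += 1
    return out[len(prompt):len(prompt) + num_tokens], passes


def greedy_decode(prompt: List[Token], num_tokens: int) -> List[Token]:
    """Baseline: one target pass per generated token."""
    out = list(prompt)
    for _ in range(num_tokens):
        out.append(target_next(out))
    return out[len(prompt):]


if __name__ == "__main__":
    tokens, passes = speculative_decode([1, 2, 3], 20)
    print("same output as greedy:", tokens == greedy_decode([1, 2, 3], 20))
    print("verify passes:", passes, "vs 20 one-token passes")
```

Under these assumptions the output matches plain greedy decoding token for token, because every committed token is one the target rule itself would have chosen. Any speedup comes only from committing several tokens per verification pass.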
