§ feed · storyline

Google's Gemma 4 AI models get 3x speed boost by predicting future tokens

Google's Gemma 4 models receive a 3x speed boost through speculative decoding that predicts future tokens, with no reported loss in output quality.

May 6 · 17:44:20 · primary fetch1 sourceupdated May 6 · 17:44:20

Up to 3x the speed with no loss of quality—is it too good to be true?

§ sources1 publication · timeline below