§ feed · storyline

xAI releases Grok 4 Fast, a distilled model with a 2M-token context

xAI releases Grok 4 Fast, a distilled reasoning model running at 344 tokens/second with a 2 million token context window and both reasoning and non-reasoning modes.

Sep 19 · 07:44:39 · primary fetch · 1 source · updated Sep 19 · 07:44:39

xAI announced Grok 4 Fast, a highly efficient model running at 344 tokens/second that offers reasoning and non-reasoning modes, with free trials on major platforms. Meta showcased its neural band and Ray-Ban Display in a live demo that hit hiccups and sparked discussion about live hardware demos and integration challenges. Meta is also developing a first-party "Horizon Engine" for AI rendering and released Quest-native Gaussian Splatting capture tech. New model releases include Mistral's Magistral 1.2, a compact multimodal vision-language model with improved benchmarks and local deployment; Moondream 3, a 9B-parameter MoE VLM focused on efficient visual reasoning; IBM's Granite-Docling-258M, a document VLM for layout-faithful PDF-to-HTML/Markdown conversion; and ByteDance's SAIL-VL2, a vision-language foundation model for multimodal understanding and reasoning at 2B and 8B parameter scales.

read full article on news.smol.ai ↗

§ sources · 1 publication · timeline below

news.smol.ai · Grok 4 Fast: Xai's distilled, 40% more token efficient, 2m context, 344 tok/s frontier model · primary · 07:44:39