§ feed · storyline

s1: Simple test-time scaling (and Kyutai Hibiki)

Researchers release s1, a reasoning model fine-tuned from Qwen 2.5 32B on 1,000 examples that extends inference compute by appending "Wait" tokens, reproducing OpenAI's o1 test-time scaling curve.

Feb 7 · 04:47:44 · primary fetch · 1 source · updated Feb 7 · 04:47:44

The s1 paper, summarized as '"Wait" is all you need', introduces a reasoning model fine-tuned from Qwen 2.5 32B on just 1,000 questions with reasoning traces distilled from Gemini 2.0 Flash Thinking. Test-time compute is controllable: when the model tries to stop reasoning, appending "Wait" makes it keep going. Lead author Niklas Muennighoff, known for work on BLOOM, StarCoder, and BIG-bench, highlights the method's efficiency and its reproduction of the well-known o1 test-time scaling chart. Separately, Kyutai Moshi's Hibiki project demonstrates offline French-English live translation on iPhone.
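The "Wait" mechanism described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the s1 authors' code: `generate` is a hypothetical callable standing in for a real model call that returns text up to the point where the model would stop thinking, `END_OF_THINKING` is a placeholder delimiter, and `toy_generate` is a stub that exists only so the example runs.

```python
from typing import Callable

# Placeholder end-of-reasoning marker (assumption); a real model defines its own.
END_OF_THINKING = "</think>"


def extend_reasoning(prompt: str, generate: Callable[[str], str], num_waits: int) -> str:
    """Build a reasoning trace, overriding the model's stop num_waits times."""
    trace = generate(prompt)
    for _ in range(num_waits):
        # The model tried to end its reasoning; append "Wait" and continue instead.
        trace += " Wait,"
        trace += generate(prompt + trace)
    # Only now let the reasoning phase end.
    return trace + END_OF_THINKING


def toy_generate(context: str) -> str:
    """Stub model (assumption): emits one numbered step per call."""
    step = context.count("Wait") + 1
    return f" step {step}."


if __name__ == "__main__":
    print(extend_reasoning("Q: 2 + 2?", toy_generate, num_waits=2))
```

Raising `num_waits` spends more inference compute on the same question, which is the knob the summary refers to as controllable test-time compute.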

Recent AI model releases include DeepSeek's open-source R1 and V3 models, potentially a major open-source milestone; Hugging Face's SmolLM2, which emphasizes data-centric training for small LMs; and IBM's Granite-Vision-3.1-2B, a small vision-language model with strong performance. Key research papers include LIMO, which achieves high accuracy on the AIME and MATH benchmarks from a minimal number of demonstrations, and Token-Assisted Reasoning, which mixes latent and text tokens to improve language model reasoning.


§ sources · 1 publication · timeline below

news.smol.ai · s1: Simple test-time scaling (and Kyutai Hibiki) · primary · 04:47:44