§ feed · storyline

Mistral Small 3 24B and Tulu 3 405B

Mistral AI releases Mistral Small 3 (24B), AI2 releases Tülu 3 405B, Sakana AI launches TinySwallow-1.5B, and Alibaba Qwen releases Qwen 2.5 Max in a wave of open-model updates.

Jan 31 · 01:08:47 · primary fetch · 1 source · updated Jan 31 · 01:08:47

Mistral AI released Mistral Small 3, a 24B-parameter model optimized for low-latency local inference that scores 81% on MMLU and competes with Llama 3.3 70B, Qwen 2.5 32B, and GPT-4o mini. AI2 released Tülu 3 405B, a large fine-tune of Llama 3 trained with Reinforcement Learning from Verifiable Rewards (RLVR) and competitive with DeepSeek V3. Sakana AI launched TinySwallow-1.5B, a Japanese language model built with TAID and intended for on-device use.

Alibaba's Qwen team released Qwen 2.5 Max, trained on 20 trillion tokens, with performance comparable to DeepSeek V3, Claude 3.5 Sonnet, and Gemini 1.5 Pro, and updated its API pricing. Together, these releases highlight advances in open models, efficient inference, and reinforcement learning techniques.

read full article on news.smol.ai ↗

§ sources · 1 publication · timeline below

news.smol.ai · Mistral Small 3 24B and Tulu 3 405B · primary · 01:08:47