§ feed · storyline

Mistral Large 2 + RIP Mistral 7B, 8x7B, 8x22B

Mistral releases Large 2 with 123B parameters, a 128k context window, and open weights under a research licence, while deprecating its older 7B, 8x7B, and 8x22B models.

Jul 25 · 01:44:31 · primary fetch1 sourceupdated Jul 25 · 01:44:31

Mistral Large 2 introduces 123B parameters with Open Weights under a Research License, focusing on code generation, math performance, and a massive 128k context window, improving over Mistral Large 1's 32k context. It claims better function calling capabilities than GPT-4o and enhanced reasoning. Meanwhile, Meta officially released Llama-3.1 models including Llama-3.1-70B and Llama-3.1-8B with detailed pre-training and post-training insights.

The Llama-3.1 8B model's 128k context performance was found underwhelming compared to Mistral Nemo and Yi 34B 200K. Mistral is deprecating older Apache open-source models, focusing on Large 2 and Mistral Nemo 12B. The news also highlights community discussions and benchmarking comparisons.

read full article on news.smol.ai ↗

§ sources1 publication · timeline below

news.smol.aiMistral Large 2 + RIP Mistral 7B, 8x7B, 8x22Bprimary01:44:31