§ feed · storyline

Qwen 2 beats Llama 3 (and we don't know how)

Alibaba releases Qwen 2 under Apache 2.0, claiming to outperform Llama 3 across open models with multilingual support in 29 languages and benchmark scores including MMLU 82.3 and HumanEval 86.0.

Jun 7 · 00:33:41 · primary fetch1 sourceupdated Jun 7 · 00:33:41

Alibaba released Qwen 2 models under Apache 2.0 license, claiming to outperform Llama 3 in open models with multilingual support in 29 languages and strong benchmark scores like MMLU 82.3 and HumanEval 86.0. Groq demonstrated ultra-fast inference speed on Llama-3 70B at 40,792 tokens/s and running 4 Wikipedia articles in 200ms. Research on sparse autoencoders (SAEs) for interpreting GPT-4 neural activity showed new training methods, metrics, and scaling laws.

Meta AI announced the No Language Left Behind (NLLB) model capable of high-quality translations between 200 languages, including low-resource ones. "Our post-training phase is designed with the principle of scalable training with minimal human annotation," highlighting techniques like rejection sampling for math and execution feedback for coding.

read full article on news.smol.ai ↗

§ sources1 publication · timeline below

news.smol.aiQwen 2 beats Llama 3 (and we don't know how)primary00:33:41