§ feed · storyline

RWKV "Eagle" v5: Your move, Mamba

RWKV releases Eagle 7B (v5) with 7.52B parameters, claiming better-than-Mistral-7B evaluation scores and stronger multilingual performance than comparable 7B-class models.

Jan 30 · 02:20:56 · primary fetch · 1 source · updated Jan 30 · 02:20:56

RWKV v5 Eagle was released with better-than-Mistral-7B evaluation results, trading some English performance for multilingual capabilities. The mysterious miqu-1-70b model sparked debate about its origins, with speculation that it is a leak or distillation of Mistral Medium, or a fine-tuned Llama 2. Discussions highlighted fine-tuning techniques, including the effectiveness of 1,000 high-quality prompts over larger mixed-quality datasets, and tools like DeepSpeed, Axolotl, and QLoRA.

The Nous Research AI community emphasized the impact of Rotary Position Embedding (RoPE) theta settings on LLM extrapolation, improving models like Mistral Instruct v0.2. Speed improvements in Mistral Tuna kernels reduced token processing costs, enhancing efficiency. The launch of Eagle 7B with 7.52B parameters showcased strong multilingual performance, surpassing other 7B-class models.
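To illustrate why the RoPE theta setting matters for extrapolation, here is a minimal sketch of how the base value sets the rotation frequencies. It uses the standard RoPE formulation, not code from the article. The function names, the head dimension of 128, and the two theta values compared are illustrative assumptions.

```python
import math

def rope_inverse_frequencies(head_dim, theta=10000.0):
    """Rotation frequency for each dimension pair: theta ** (-2i / head_dim)."""
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]

def longest_wavelength(head_dim, theta=10000.0):
    """Token positions needed for the slowest-rotating pair to complete one full turn."""
    slowest = rope_inverse_frequencies(head_dim, theta)[-1]
    return 2 * math.pi / slowest

def rotate_pair(x, y, position, inv_freq):
    """Apply the RoPE rotation to one (x, y) dimension pair at a given token position."""
    angle = position * inv_freq
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c)

if __name__ == "__main__":
    # Illustrative values only: head_dim and both theta settings are assumptions.
    for theta in (10_000.0, 1_000_000.0):
        print(theta, longest_wavelength(128, theta))
```

A larger theta stretches the longest wavelength, so distant positions map to rotation angles that stay distinguishable over longer contexts. This is the general reason the base is raised when targeting longer sequences.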

read full article on news.smol.ai ↗

§ sources · 1 publication · timeline below

news.smol.ai · RWKV "Eagle" v5: Your move, Mamba · primary · 02:20:56