§ feed · storyline

Qwen 1.5 Released

Qwen 1.5 releases with up to 32k token context and support for Hugging Face transformers and quantized models.

Feb 7 · 00:40:32 · primary fetch · 1 source · updated Feb 7 · 00:40:32

Chinese AI models Yi, DeepSeek, and Qwen are gaining attention for strong performance, with Qwen 1.5 offering up to 32k token context and compatibility with Hugging Face transformers and quantized models. The TheBloke Discord discussed quantization of a 70B LLM; the introduction of Sparsetral, a sparse MoE model based on Mistral; debates on merging versus fine-tuning; and Direct Preference Optimization (DPO) for character generation.

The Nous Research AI Discord covered challenges in Japanese Kanji generation, AI scams on social media, and Meta's VR headset prototypes showcased at SIGGRAPH 2023. Discussions also included fine-tuning frozen networks and new models like bagel-7b-v0.4, DeepSeek-Math-7b-instruct, and Sparsetral-16x7B-v2.


§ sources · 1 publication · timeline below

news.smol.ai · Qwen 1.5 Released · primary · 00:40:32