§ feed · storyline

Qwen with Questions: 32B open weights reasoning model nears o1 in GPQA/AIME/Math500

Alibaba's Qwen team releases QwQ-32B, an open-weight reasoning model that outperforms GPT-4o and Claude 3.5 Sonnet on GPQA, AIME, and Math500 benchmarks.

Nov 28 · 02:23:25 · primary fetch · 1 source · updated Nov 28 · 02:23:25

DeepSeek r1 leads the race for "open o1" models but has yet to release weights. Meanwhile, Justin Lin released QwQ, a 32B open-weight model that outperforms GPT-4o and Claude 3.5 Sonnet on the GPQA, AIME, and Math500 benchmarks. QwQ appears to be a fine-tuned version of Qwen 2.5 that emphasizes sequential search and reflection for complex problem-solving. Separately, SambaNova promotes its RDUs as superior to GPUs for inference tasks, highlighting the shift from training to inference in AI systems.

On Twitter, Hugging Face announced CPU deployment for llama.cpp instances, and Marker v1 was released as a faster and more accurate version of the tool. Agentic RAG developments focus on integrating external tools and advanced LLM chains to improve response accuracy. The open-source AI community is gaining momentum, with models like Flux growing in popularity, reflecting a shift towards multi-modal AI models spanning image, video, audio, and biology.


§ sources · 1 publication · timeline below

news.smol.ai · Qwen with Questions: 32B open weights reasoning model nears o1 in GPQA/AIME/Math500 · primary · 02:23:25