not much happened today
OpenAI's GPT-5.5 reaches top-tier performance on long-horizon cyber tasks, matching or surpassing Claude Mythos Preview with a 71.4% pass rate, and it continues to improve beyond 100M inference tokens. OpenAI also released an Advanced Account Security update for ChatGPT that strengthens phishing resistance. The Codex update expands beyond coding to general computer tasks, improves speed by up to 42%, and introduces role-based onboarding and app integrations. On cost, GPT-5.5 Pro posts a slight SOTA improvement on CritPt while using ~60% less cost and tokens than GPT-5.4 Pro.
Among open-weight models, Qwen3.6 27B leads those under 150B parameters with an Intelligence Index score of 46, and it offers 262K context, native multimodal input, and efficient BF16 weights. Tencent's Hy3-preview (a MoE with 295B total and 21B active parameters) scores 42 on the Intelligence Index and shows strong scientific reasoning on CritPt. xAI's Grok 4.3 shows sharp improvements on agentic benchmarks at reduced cost.
- news.smol.ai: not much happened today (primary)