Security researchers at Mozilla's 0DIN platform have shown how a single compromised GitHub repo can take over a developer's machine the moment an AI coding tool like Claude Code runs its setup. The catch: the malicious…
Zvi Mowshowitz / Don't Worry About the Vase: GPT-5.6 system card indicates Sol is well below the level of most worrisome Mythos use cases, suggesting all GPT-5.6 versions could be released without delay — While…
360 founder Zhou Hongyi presents two AI security tools designed to compete with Anthropic's Mythos. One has already flagged 3,432 vulnerabilities. Zhou admits Chinese models trail Western ones by 20 to 30 percent, but…
via the-decoder.com
Independent testing organization METR found that OpenAI's GPT-5.6 Sol cheated more than any publicly tested AI model before it, exploiting bugs in the test environment, extracting hidden solutions, and trying to cover…
Epoch AI's new MirrorCode benchmark tests whether AI models can recreate complete programs without access to the original code. Claude Opus 4.7 leads with a 56 percent solve rate, rebuilding a 16,000-line toolkit in…
via the-decoder.com
Researchers introduced DFlash, a speculative decoding model that drafts entire token blocks in parallel, achieving up to 15x throughput on NVIDIA Blackwell GPUs.
via marktechpost.com
Zhipu AI's GLM-5.2 nearly matches Claude Opus 4.7 in a Snowflake benchmark with 103 coding tasks at one-fifth the cost per output token. But the Chinese model burns through nearly twice as many tokens per task. Still…
via the-decoder.com
Thomas Claburn / The Register: Nature publishes a peer-reviewed paper alleging that Microsoft's 2025 quantum breakthrough claims were based on “basic Python errors” and data cherry-picking — Nature…
via theregister.com
13:41:02 · Google News: AI Products & Releases · +8 sources
Seven AI agents work together like a newsroom. The "Data Journalist Agent" from Oxford and Stanford turns a CSV file into a finished interactive article with graphics, web research, and verifiable source links for 93…
via the-decoder.com
Friday, June 19, 2026’s edition
OpenAI researchers show that reinforcement learning on desired behavioral traits like truthfulness and corrigibility works across domains. Training on health data also improved deception detection, and the model scored…
via the-decoder.com
Thursday, June 18, 2026’s edition
Artificial Analysis: GLM-5.2 is the leading open weights model on Artificial Analysis' Intelligence Index, scoring 51, only behind Fable 5's 60, Opus 4.8's 56, and GPT-5.5's 55 — Z ai's GLM-5.2 is the new leading…
via artificialanalysis.ai
OpenAI introduces Deployment Simulation, a method to predict AI model behavior before deployment using real conversation data to improve safety and evaluation accuracy.
via openai.com
Perplexity published a guide demonstrating how its internal teams use its 'Computer' AI agent to automate workflows across various departments, including recruiting, design, and growth marketing.
via perplexity.ai
New AgentPerf results from Artificial Analysis show how accelerated computing systems handle real-world agentic workloads, with NVIDIA GB300 NVL72 running up to 20x more agents per megawatt than NVIDIA Hopper.
via blogs.nvidia.com
Thursday, June 11, 2026’s edition
Anthropic's security team found that its Mythos Preview AI model can turn security patches for Firefox and the Windows kernel into working exploits within hours, for a few thousand dollars and no specialized knowledge…
via the-decoder.com