shipfeed · AI news, curated daily

OpenAI takes on Gemini's Deep Research

OpenAI releases the full o3 agent with a Deep Research variant achieving SOTA results on GAIA, limited to 100 queries per month with a higher-rate tier planned.

Feb 4 · primary fetch · 1 source · updated Feb 4

OpenAI released the full version of the o3 agent, with a new Deep Research variant that shows significant improvements on the HLE benchmark and achieves SOTA results on GAIA. The release includes an "inference time scaling" chart presented as evidence of rigorous research, though some criticism arose over results reported on public test sets. The agent is described as "extremely simple" and is currently limited to 100 queries/month, with a higher-rate version planned.

Reception has been mostly positive, with some skepticism. Separately, advances in reinforcement learning were highlighted, including budget forcing, a simple test-time scaling technique reported to improve reasoning on math competitions by 27%. Researchers from Google DeepMind, NYU, UC Berkeley, and HKU contributed to these findings. The original Gemini Deep Research team will participate in the upcoming AI Engineer NYC event.

read full article on news.smol.ai
§ sources · 1 publication
  1. news.smol.ai · OpenAI takes on Gemini's Deep Research · primary