shipfeed · AI news, curated daily

§ models · storyline

Datacurve releases DeepSWE coding benchmark; GPT-5.5 leads at 70%

Datacurve releases the DeepSWE coding benchmark, a 113-task test spanning 91 open-source repositories and five languages, with GPT-5.5 leading at 70%.

May 27 · primary fetch · 1 source · updated May 27

Michael Nuñez / VentureBeat: Datacurve releases the DeepSWE coding benchmark, a 113-task test across 91 open-source repositories and five languages, and says GPT-5.5 is the leader at 70% — For months, the leading AI coding benchmarks have told enterprise buyers a comforting but misleading story: the top models are all roughly the same.

read full article on venturebeat.com
§ sources · 1 publication · timeline below
  1. venturebeat.com · Datacurve releases the DeepSWE coding benchmark, a 113-task test across 91 open-source repositories and five languages, and says GPT-5.5 is the leader at 70% (Michael Nuñez/VentureBeat) · primary