shipfeed · AI news, curated daily

Introducing the SWE-Lancer benchmark

OpenAI introduces SWE-Lancer, a benchmark that tests frontier LLMs on real-world freelance software engineering tasks worth a combined $1 million in payouts.

Feb 18 · 1 source · updated Feb 18

Can frontier LLMs earn $1 million from real-world freelance software engineering?

read full article on openai.com
§ sources · 1 publication
  1. openai.com · Introducing the SWE-Lancer benchmark · primary