shipfeed · AI news, curated daily


Large Reasoning Models Fail to Follow Instructions During Reasoning: A Benchmark Study

The ReasonIF benchmark finds that frontier large reasoning models fail to follow instructions during the reasoning process more than 75% of the time. It tests instructions covering languages, formatting, and length constraints.

Oct 22 · primary fetch · 1 source · updated Oct 22

ReasonIF finds frontier LRMs fail to follow reasoning instructions >75% of the time; introduces a benchmark across languages, formatting, and length.

read full article on together.ai
§ sources · 1 publication
  1. together.ai · Large Reasoning Models Fail to Follow Instructions During Reasoning: A Benchmark Study · primary