shipfeed · AI news, curated daily


Fine-tuning open LLM judges to outperform GPT-5.2

Researchers fine-tune GPT-OSS 120B using Direct Preference Optimization on 5,400 preference pairs, outperforming GPT-5.2 as an LLM judge at 15x lower cost and 14x faster inference.

Feb 2 · primary fetch · 1 source · updated Feb 2

Fine-tuned open-source LLM judges can outperform GPT-5.2 at evaluating model outputs. Using Direct Preference Optimization on just 5,400 preference pairs, we trained GPT-OSS 120B to beat GPT-5.2 on human preference alignment, at 15x lower cost and with 14x faster inference.
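For readers unfamiliar with Direct Preference Optimization, the per-pair objective can be sketched as below. This is a minimal illustration of the standard DPO loss, not the training code from the article: the function name `dpo_loss`, the `beta` value of 0.1, and the example log-probabilities are all assumptions made for illustration. Each preference pair contributes one such term, where the inputs are summed token log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses under the model being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair.

    All arguments are sequence log-probabilities (sums over tokens).
    beta is a hypothetical default; the article does not state its value.
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp

    # Scaled margin: positive when the policy favors the chosen response
    # more strongly than the reference does.
    margin = beta * (chosen_ratio - rejected_ratio)

    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Illustrative numbers only: the policy has shifted toward the chosen answer.
loss_improved = dpo_loss(-10.0, -12.0, -11.0, -11.0)
# A policy identical to the reference gives a zero margin.
loss_neutral = dpo_loss(-11.0, -11.0, -11.0, -11.0)
```

A policy that matches the reference yields a loss of log 2, and the loss falls as the policy raises the chosen response's likelihood relative to the rejected one.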

read full article on together.ai
§ sources · 1 publication · timeline below
  1. together.ai · Fine-tuning open LLM judges to outperform GPT-5.2 · primary