shipfeed · AI news, curated daily

§ models · storyline

Microsoft's Lens model shows detailed captions beat scale in image generation

Microsoft Research releases Lens, a 3.8-billion-parameter text-to-image model trained on 800 million GPT-4.1-generated captions that matches larger rivals at lower cost, with code and weights openly available.

Jun 8 · 1 source · updated Jun 8

Microsoft Research presents Lens, a text-to-image model with just 3.8 billion parameters that matches much larger rivals on benchmarks at a fraction of the training cost. The key ingredient: 800 million detailed image captions generated by GPT-4.1, used in place of vague web alt-text.

Code and weights are openly available under an open-source license. The article "Microsoft Research's Lens proves detailed captions matter more than raw scale for training efficient image generators" first appeared on The Decoder.

read full article on the-decoder.com
§ sources · 1 publication
  1. the-decoder.com: Microsoft Research's Lens proves detailed captions matter more than raw scale for training efficient image generators (primary)