shipfeed: AI news, curated daily

DataComp-LM: the best open-data 7B model/benchmark/dataset

DataComp team releases DCLM, a 7B language model trained on 2.5T tokens from its 240T-token open dataset, alongside a benchmark showing stronger scaling trends than FineWeb.

Jul 20 · primary fetch · 1 source · updated Jul 20

The DataComp team released a competitive 7B open-data language model trained on only 2.5T tokens drawn from DCLM-POOL, a dataset of 240 trillion tokens, and showing better scaling trends than FineWeb. OpenAI launched GPT-4o mini, a cost-effective model that scores 82% on MMLU and performs close to GPT-4-Turbo, aimed at developers building a broad range of applications. NVIDIA and Mistral jointly released Mistral NeMo 12B, a model with a 128k-token context window, an FP8 checkpoint, multilingual support, and an Apache 2.0 license.

DeepSeek announced that DeepSeek-V2-0628 is the top open-source model on the LMSYS Chatbot Arena leaderboard, with strong rankings in coding, math, and hard prompts. Together, these releases highlight advances in dataset design, model efficiency, and open-source contributions across the AI community.

read full article on news.smol.ai
§ sources · 1 publication
  1. news.smol.ai · DataComp-LM: the best open-data 7B model/benchmark/dataset (primary)