shipfeedAI news, curated daily

00:32:37 CET
21 MAY00:32:37shipfeed
pull to refreshlast sync
Just in — 30 new
§ feed · storyline

DeepSeek's Open Source Stack

DeepSeek's Open Source Week produced several releases including fine-tuned models, embedding tools, diffusion-based code generation, and OCR datasets across multiple labs.

Mar 8 · · primary fetch1 sourceupdated Mar 8 ·

DeepSeek's Open Source Week was summarized by PySpur, highlighting multiple interesting releases. The Qwen QwQ-32B model was fine-tuned into START, excelling in PhD-level science QA and math benchmarks. Character-3, an omnimodal AI video generation model by Hedra Labs and Together AI, enables realistic animated content creation. Google DeepMind introduced the Gemini embedding model with an 8k context window, ranking #1 on MMTEB, alongside the Gemini 2.0 Code Executor supporting Python libraries and auto-fix features. Inception Labs' Mercury Coder is a diffusion-based code generation model offering faster token processing.

OpenAI released GPT-4.5, their largest model yet but with less reasoning ability than some competitors. AI21 Labs launched Jamba Mini 1.6, noted for superior output speed compared to Gemini 2.0 Flash, GPT-4o mini, and Mistral Small 3. A new dataset of 1.9M scanned pages was released for OCR benchmarking, with Mistral OCR showing competitive but not top-tier document parsing performance compared to LLM/LVM-powered methods. "Cracked engineers are all you need."

read full article on news.smol.ai
§ sources1 publication · timeline below
  1. news.smol.aiDeepSeek's Open Source Stackprimary
DeepSeek's Open Source Stack · shipfeed