shipfeed · AI news, curated daily


Microsoft releases VibeVoice speech-to-text model with speaker diarization

Microsoft releases VibeVoice, an MIT-licensed speech-to-text model with speaker diarization that transcribes one hour of audio in roughly nine minutes on Apple Silicon hardware.

Apr 28 · primary fetch · 1 source · updated Apr 28

Microsoft's MIT licensed VibeVoice speech-to-text model (think Whisper with speaker diarization) is really good - my notes on running the 5.71GB 4bit MLX conversion on an M5 MacBook, using about 60GB of RAM at peak and transcribing 1hr of audio in ~9 mins simonwillison.net/2026/Apr/27/...

§ sources · 1 publication
  1. bsky.app (primary): Microsoft's MIT licensed VibeVoice speech-to-text model (think Whisper with speaker diarization) is really good - my notes on running the 5.71GB 4bit MLX conversion on an M5 MacBook, using about 60GB…