Thinking Machines Lab ships its first model and argues interactivity is what OpenAI gets wrong about voice
Thinking Machines Lab has released its first multimodal AI model. It processes audio, video, and text in 200-millisecond chunks and challenges OpenAI and Google on the quality of real-time voice interaction.
Mira Murati's startup has presented its first AI model, which aims to free voice AI from the question-and-answer format. The model processes audio, video, and text in parallel in 200-millisecond chunks and is meant to beat OpenAI's GPT Realtime 2 and Google's Gemini Live on interaction quality.
The article "Thinking Machines Lab ships its first model and argues interactivity is what OpenAI gets wrong about voice" appeared first on The Decoder.