§ agents · storyline

Build realtime voice agents on AI Gateway

Vercel launches AI Gateway Audio with realtime voice, text-to-speech, and speech-to-text capabilities, served by models from OpenAI and xAI.

today · 09:00:00 · primary fetch · 1 source · updated today · 09:00:00

AI Gateway now supports audio and voice. You can add realtime voice, text-to-speech, and speech-to-text with the same calls you already use for text, image, and video, routed through AI Gateway alongside every other modality. AI Gateway Audio launches with models from OpenAI and xAI. Each call gets the same provider routing, observability, spend controls, and bring-your-own-key support you already use for your other models. These capabilities are in beta and available in AI SDK 7, which supports realtime, speech, and transcription models. Realtime turns your app into something a user can hold a conversation with.

When they speak, the model responds right away. Because it replies in the moment instead of waiting for a full turn, users can interrupt and talk over it the way they would with a person. It fits voice assistants, customer support agents, hands-free tools, and anywhere a user would rather talk than type. What sets it apart from chaining models together is that a single realtime model hears audio and produces audio directly, instead of running a speech-to-text, then language model, then text-to-speech pipeline. In the browser, the hook manages the WebSocket connection, microphone…
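The difference between the two architectures can be sketched in a few lines. This is a toy illustration only: `chainedTurn`, `realtimeTurn`, and the stub stages are hypothetical names invented here, not AI SDK or AI Gateway APIs, and the stubs relabel data rather than run any model.

```typescript
// Illustrative stand-ins only: none of these names come from AI SDK or AI Gateway.
type Audio = { kind: "audio"; content: string };
type Text = { kind: "text"; content: string };
type Result = { output: Audio; stages: string[] };

// Stub stages. Real models would run inference; these just relabel the payload.
const speechToText = (a: Audio): Text => ({ kind: "text", content: a.content });
const languageModel = (t: Text): Text => ({ kind: "text", content: `reply to "${t.content}"` });
const textToSpeech = (t: Text): Audio => ({ kind: "audio", content: t.content });

// Chained approach: three models, with text as the intermediate format.
function chainedTurn(input: Audio): Result {
  const transcript = speechToText(input);
  const reply = languageModel(transcript);
  return {
    output: textToSpeech(reply),
    stages: ["speech-to-text", "language model", "text-to-speech"],
  };
}

// Realtime approach: one model, audio in and audio out, no text hop in between.
function realtimeTurn(input: Audio): Result {
  return {
    output: { kind: "audio", content: `reply to "${input.content}"` },
    stages: ["realtime model"],
  };
}

const question: Audio = { kind: "audio", content: "what's the weather?" };
const chained = chainedTurn(question);
const realtime = realtimeTurn(question);

// Same audio reply either way, but a different number of model hops.
console.log(chained.stages.length, realtime.stages.length); // prints "3 1"
```

In the chained version every turn has to wait for a finished transcript before the language model can start, which is why it behaves turn by turn. The single realtime hop is what makes the interruptions described above possible.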

read full article on vercel.com ↗

§ sources · 2 publications · timeline below

vercel.com · Build realtime voice agents on AI Gateway · primary · 09:00:00
vercel.com · Realtime voice, speech, and transcription now supported on AI Gateway · 02:00:00

§ how this story moved

02:00:00 · primary · Vercel Changelog publishes the launch post.
09:00:00 · Vercel Changelog picks up coverage.