OpenAI has officially launched GPT‑Realtime, a speech‑to‑speech model with a smoother, more expressive voice that can switch languages mid-sentence and mirror shifts in tone, all through a single low-latency pipeline.
It also announced that the Realtime API is now generally available, adding features like image input, SIP calling, and Model Context Protocol (MCP) support to make voice agents more capable and production-ready.
Two new voices, Cedar and Marin, debut exclusively with this update.
Compared to earlier models, GPT‑Realtime scored significantly higher on reasoning benchmarks (82.8% vs. 65.6%) and on instruction-following tasks, making it smarter and more reliable for real-world voice agent use cases such as complex support scripts, tool interaction, and multilingual conversations.