Microsoft has rolled out its first in-house AI models, MAI-Voice-1 and MAI-1-preview, signaling a move toward self-reliance and deeper integration within its ecosystem.
MAI-Voice-1 is a resource-efficient speech generation model that can produce up to a minute of audio in under a second on a single GPU.
It’s already powering features like Copilot Daily and Podcasts, and is accessible via Copilot Labs for customizable voice experiences.
Meanwhile, MAI-1-preview is a text foundation model trained on around 15,000 NVIDIA H100 GPUs.
It targets consumer-facing tasks that call for instruction following and helpful responses. Public testing is underway on LMArena, and Microsoft plans to integrate the model into Copilot gradually.
Microsoft AI chief Mustafa Suleyman emphasized that these models are designed to perform above expectations for their size, especially in consumer use cases where the company has rich usage data to optimize performance.
The launch reflects Microsoft’s broader strategy to reduce dependence on external partners like OpenAI while diversifying AI models to serve a range of user needs.





