OpenAI Adds 750MW Of Cerebras AI Inference Capacity

OpenAI said it had partnered with Cerebras to add 750 megawatts of ultra-low-latency AI compute to its platform, with capacity coming online in stages through 2028.

The company said the new hardware would be used for real-time inference across its products.

Cerebras builds its systems around a single wafer-scale chip that combines compute, memory, and bandwidth on one piece of silicon, reducing the delays that slow conventional AI hardware.

OpenAI said those systems were designed to handle long outputs and rapid back-and-forth interactions.

The companies said the new capacity would be integrated into OpenAI’s inference stack in phases and expanded across multiple workloads.

OpenAI said the goal was to make responses faster when users asked complex questions, generated code, created images, or ran AI agents.

Sachin Katti of OpenAI said the partnership added a dedicated low-latency option to its compute portfolio. Cerebras CEO Andrew Feldman said real-time inference would change how people built and interacted with AI models.

Why This Matters Today

If you rely on OpenAI for interactive or agent-based tasks, response time affects how much work you can get done in a single session.

OpenAI stated that faster inference encouraged longer usage and supported more complex, real-time workloads.

The deal also shows how OpenAI is diversifying its infrastructure beyond traditional GPU clusters.

By adding purpose-built inference hardware, the company aims to match different types of AI work to different systems.

The multi-year rollout matters because it signals a long-term commitment.

With capacity scheduled through 2028, OpenAI is planning for sustained growth in real-time AI demand rather than short-term bursts.

Our Key Takeaways:

  • OpenAI partnered with Cerebras to add 750MW of low-latency inference capacity to its platform.

  • The hardware is designed to reduce response times for tasks such as generating code, creating images, and running AI agents.

  • The capacity will be deployed in phases through 2028 as OpenAI scales real-time AI services.

You may also want to check out some of our other tech news updates.

Wanna know what’s trending online every day? Subscribe to Vavoza Insider to access the latest business and marketing insights, news, and trends daily with unmatched speed and conciseness. 🗞️
