OpenAI is entering an increasingly competitive enterprise AI voice market with its new model, gpt-realtime, which the company says follows complex instructions and offers voices “that sound more natural and expressive.”
As voice AI adoption grows and customers find use cases such as customer service calls and real-time translation, the market for realistic-sounding AI voices that also offer enterprise-grade security is heating up. OpenAI claims its new model provides a more human-like voice, but it still has to compete against companies like ElevenLabs.
The model is available through the Realtime API, which the company has also made generally available. Along with gpt-realtime, OpenAI released two new voices on the API, Cedar and Marin, and updated its existing voices to work with the latest model.
OpenAI said in a livestream that it worked with its customers who are building voice applications to train gpt-realtime and “carefully aligned the model to evals that are built on real-world scenarios like customer support and academic tutoring.”
The company touted the model’s ability to produce emotive, natural-sounding voices while fitting the way developers build with the technology.
Speech-to-speech models
The model operates within a speech-to-speech framework, enabling it to understand spoken prompts and respond vocally. Speech-to-speech models are ideally suited for real-time responses, where a person, typically a customer, interacts with an application.
For example, a customer wants to return some products and calls a customer service platform. They could be talking to an AI voice assistant that responds to questions and requests as if …