AI company OpenAI is beginning to roll out new voice features for its ChatGPT chatbot to a small number of ChatGPT Plus subscribers in an early alpha trial, it said on X on Tuesday.
The startup previewed advanced voice mode during its Spring Update in May, where it also debuted its GPT-4o model.
Users with access have jumped on social media to share their initial experiences, which include getting help with French pronunciations, mimicking an airline pilot speaking from the cockpit and imitating seven US regional dialects. The New York and Midwestern accents could use a little work, but the chatbot knows that New Yorkers fold their pizza.
OpenAI, whose Plus subscribers pay $20 per month for perks like early access, isn’t alone in its ambitions for chatbot voice functionality. Google, too, shared its plans for a more conversational Gemini chatbot via its Gemini Live feature for Gemini Advanced subscribers, who also pay $20 per month. Meta’s Meta AI chatbot can also chat with users who are wearing its Ray-Ban glasses.
This is one example of how technology companies continue to roll out new models and features, both to appeal to users and as part of an ongoing game of one-upmanship. The prize? The biggest share of the generative AI market, which is projected to be worth $1.3 trillion by 2032.
Hey, ChatGPT
According to OpenAI, advanced voice mode allows you to have more natural, real-time conversations with ChatGPT. It also senses and responds to your emotions, and you can interrupt if you want.
You can call up ChatGPT with a familiar phrase: “Hey, ChatGPT.”
Beyond that, details about what exactly this advanced functionality includes are unclear. A spokesperson didn’t respond to a request for comment.
Subscribers in the alpha test will receive a notice in the ChatGPT app, along with an email with instructions about how to use it. The goal of the early trial is to monitor usage and improve the model’s capabilities and safety prior to wider rollout, a spokesperson said in an earlier email.
OpenAI will expand access to additional subscribers over the next few weeks and plans to offer advanced voice functionality to all Plus members in the fall. In addition to early access to new features, Plus members also receive an always-on connection and unlimited access to GPT-4o. (If you use the free version, you’ll be bumped down to the earlier GPT-3.5 model if you ask too many questions or if traffic is high.)
OpenAI first introduced voice functionality to ChatGPT in September 2023.
Advanced voice mode will include four preset voices (Breeze, Cove, Ember and Juniper), which OpenAI developed with voice actors in 2023. There was originally a fifth voice, Sky, but it was paused after actor Scarlett Johansson, who voiced the virtual assistant Samantha in the 2013 movie Her, complained about similarities to her own voice.
CEO Sam Altman released a statement apologizing to Johansson but said the voice wasn’t meant to resemble hers.
In a related blog post, OpenAI said it chose voice actors from diverse backgrounds and looked for voices that feel timeless; that are approachable and trustworthy; that are warm, engaging and charismatic; and that are natural and easy to listen to.
OpenAI said ChatGPT can’t impersonate voices, and it has added filters that will block requests to generate copyrighted audio.