ChatGPT is about to get a big upgrade, with paying users soon able to hold two-way voice conversations with the tool.
OpenAI also announced that the artificial intelligence (AI) chatbot, which was rolled out in November last year, will be able to look at images and understand what’s going on in them.
The voice feature will be available on the iOS and Android apps, where users will be able to pick from five different voices. It is built on a new text-to-speech model developed by the company, and will also incorporate its Whisper AI tool, a speech-recognition system that can transcribe spoken words to text.
The five voices, a mix of male and female with American accents, are called Juniper, Sky, Cove, Ember and Breeze. OpenAI suggests the voices could be used for everything from reading a bedtime story to your children, to settling a debate at the dinner table.
OpenAI’s CEO Sam Altman has spoken about the need for regulation of AI due to the dangers it potentially poses to humanity, but his company appears to be pressing ahead with developments despite some calls for a pause.
A further development, announced by the company in a blog post, is that users will be able to show ChatGPT images, which the AI can view and analyse. The company claims it tested the model in “domains such as extremism and scientific proficiency” to help it deploy the feature responsibly.
Spotify tapping into OpenAI technology
OpenAI’s tools are also being used by another tech giant, Spotify, which on Monday announced a new feature for translating podcasts.
The company said, also in a blog post, that the Spotify-developed tool uses the “latest innovations”, including OpenAI’s voice generation technology, to translate a speaker’s voice while maintaining their style and tone.
It is starting with a pilot rollout of a number of episodes from podcasters such as Dax Shepard, Lex Fridman, and Monica Padman, with AI translating their episodes into languages including Spanish, French and German.