
OpenAI Unveils ChatGPT’s Highly Realistic Voice Feature for Select Paid Users

by admin

OpenAI began rolling out ChatGPT’s Advanced Voice Mode on Tuesday, giving the first batch of users access to GPT-4o’s strikingly realistic audio responses. The feature is initially available to a small group of ChatGPT Plus subscribers, and OpenAI plans to extend it to all Plus members by fall 2024.

The May unveiling of GPT-4o’s voice capabilities astounded audiences with its rapid, eerily human-like responses, particularly a voice dubbed Sky, which bore a striking resemblance to Scarlett Johansson, who voiced an artificial assistant in the film “Her.” Following the showcase, Johansson disclosed that she had refused several offers from OpenAI CEO Sam Altman to use her voice, and after seeing the demo she engaged legal counsel to protect her likeness. Although OpenAI denied using Johansson’s voice, it subsequently withdrew the Sky voice from the demo. In June, OpenAI announced it was postponing the launch of Advanced Voice Mode to strengthen its safety protocols.

One month on, the wait is partially over. OpenAI says the video and screen-sharing features highlighted in its Spring Update will debut at a later date. For now, the remarkable GPT-4o demonstration remains just that, a demonstration, though some premium users will gain access to the showcased voice feature in ChatGPT.

ChatGPT Acquires the Ability to Speak and Listen

While ChatGPT’s existing Voice Mode may already be familiar, OpenAI is positioning Advanced Voice Mode as something new. The previous setup chained three separate models: one to convert voice to text, GPT-4 to process the text, and a third to convert the reply back to voice. GPT-4o is a multimodal model that handles all of these steps itself, which significantly reduces conversation latency. OpenAI also asserts that GPT-4o can identify various emotional tones in speech, such as joy, sorrow, or even singing.
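To make the latency difference concrete, here is a minimal illustrative sketch of the two architectures. This is not OpenAI’s actual API or code, and the latency figures are invented placeholders purely to show why collapsing three sequential models into one pass lowers end-to-end delay.

```python
# Hypothetical sketch: legacy three-model voice pipeline vs. a single
# multimodal pass. All latency numbers below are made-up placeholders,
# not measurements of any real system.

# Each stage of the old pipeline runs sequentially, so latencies add up.
PIPELINE_LATENCY_MS = {
    "speech_to_text": 300,   # transcribe the user's audio
    "gpt4_reasoning": 2500,  # generate a text reply
    "text_to_speech": 600,   # synthesize the reply as audio
}

# A single multimodal model processes audio in and audio out directly.
MULTIMODAL_LATENCY_MS = 320  # hypothetical single-pass figure


def pipeline_latency_ms() -> int:
    """Total latency of the legacy pipeline: the sum of all three stages."""
    return sum(PIPELINE_LATENCY_MS.values())


def multimodal_latency_ms() -> int:
    """Latency of one end-to-end pass through a multimodal model."""
    return MULTIMODAL_LATENCY_MS


if __name__ == "__main__":
    print(f"pipeline:   {pipeline_latency_ms()} ms")
    print(f"multimodal: {multimodal_latency_ms()} ms")
```

The point of the sketch is structural rather than numerical: sequential stages add their latencies (and each hand-off loses information such as tone of voice), while a single model avoids both costs.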

In this early phase, ChatGPT Plus users will have the opportunity to experience the remarkably realistic nature of OpenAI’s Advanced Voice Mode. Although TechCrunch has not tested the feature prior to this publication, a review will follow upon access.

OpenAI is rolling out the new voice feature in phases so it can closely monitor usage. Users selected for the alpha release will receive a notification in the ChatGPT app and an email explaining how to activate the feature.

Since the GPT-4o voice demonstration, OpenAI has engaged more than 100 external testers speaking 45 languages to evaluate its capabilities. A report on these safety efforts is expected in early August.

Advanced Voice Mode will be limited to four predefined voices – Juniper, Breeze, Cove, and Ember – produced in collaboration with professional voice actors. According to OpenAI spokesperson Lindsay McCallum, the previously demonstrated Sky voice will not appear in ChatGPT: “ChatGPT will not replicate the voices of individuals or celebrities, ensuring output remains consistent with these original voices.”

OpenAI aims to steer clear of deepfake scandals. In a notable case from January, AI startup ElevenLabs’s voice cloning technology was exploited to mimic President Biden, misleading voters in New Hampshire.

Moreover, OpenAI introduced additional filters to prevent the generation of copyrighted music or audio, responding to the rising legal challenges within the AI domain, particularly from record labels with a litigious track record who have already initiated lawsuits against AI-powered song generators like Suno and Udio.

Compiled by Techarena.au.
