OpenAI unveiled a voice-cloning tool called "Voice Engine" on Friday, designed to replicate someone's speech based on a 15-second audio sample. However, the company plans to keep the tool under strict control until safeguards are in place to prevent audio fakes intended to deceive listeners.
The San Francisco-based company acknowledged the serious risks associated with generating speech that resembles people's voices, especially in an election year. To address these concerns, OpenAI is collaborating with partners from various sectors, including government, media, entertainment, education, and civil society, to gather feedback and ensure responsible development.
Disinformation researchers have raised concerns that AI-powered voice-cloning tools could be misused in pivotal election years because the tools are accessible, affordable, and difficult to trace.
OpenAI stated that it is taking a cautious approach to releasing Voice Engine more widely, given the potential for synthetic voice misuse. This comes after a political consultant admitted to orchestrating a robocall impersonating US President Joe Biden during the New Hampshire primary, raising alarms about the use of deepfake technology in political campaigns.
Partners testing Voice Engine must adhere to rules requiring explicit and informed consent from individuals whose voices are duplicated using the tool. Additionally, audiences must be informed when they are listening to AI-generated voices. OpenAI has implemented safety measures, including watermarking to trace the origin of any audio generated by Voice Engine and proactive monitoring of how the tool is used.
The introduction of Voice Engine underscores the growing need for robust safeguards to mitigate the risks associated with AI-generated deepfake content, particularly in sensitive contexts such as elections.