I cloned my voice in seconds using a free AI app, and we really need to talk about speech synthesis

That voice you hear – even one you recognize – might not be real, and you may have no way of knowing. Voice synthesis is not a new phenomenon, but a growing number of freely available apps are putting this powerful voice-cloning capability in the hands of ordinary people, and the ramifications could be far reaching and unstoppable.

A recent Consumer Reports study that looked at half a dozen such tools puts the risks in stark relief. Platforms like ElevenLabs, Speechify, Resemble AI, and others use powerful speech synthesis models to analyze and recreate voices, sometimes with few or no safeguards in place. Some try – Descript, for example, requires recorded voice consent before the system will recreate a voice signature. But others are not so careful.

I found an app called PlayKit from Play.ht that will let you clone a voice for free for three days and then charges you $5.99 a week. The paywall is in theory something of a barrier against potential misuse – except that I was able to clone a voice without starting the trial.

Say, 'Too easy'

The app whisks you through setup and then presents some pre-made voice clones, including ones for President Donald Trump and Elon Musk (yes, you can make the President say things like, 'I think DEI should be supported and expanded around the world'). But at the top is a 'Clone a voice' option.

All I had to do was select a video from my photos library and upload it. Videos must be at least 30 seconds long (but not longer than a minute) and in English. I could have chosen one with anyone in it and, if I had, say, filmed a clip of a George Clooney interview, I could have uploaded that (more on that later).

The system quickly analyzed the audio. The app doesn't tell you if this is being done locally or in the cloud, but I'll assume the latter, since such powerful models rarely work locally on a mobile device (see ChatGPT in Apple Intelligence). I saved my voice clone with my name so that I could select it again from the list of cloned voices.

When I want my clone to say something in my voice, I simply type in the text and hit a big Generate button. That process usually takes 10 to 15 seconds.
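The flow the app exposes – upload a 30-to-60-second sample, get back a cloned voice, then generate speech from typed text – maps onto the kind of request payloads cloud text-to-speech services typically accept. The sketch below is purely illustrative: the function names and fields are my assumptions about how such a flow might be packaged, not Play.ht's actual API.

```python
# Illustrative sketch of a clone-then-generate workflow.
# Field names and structure are hypothetical, not any vendor's real API.
import json


def build_clone_request(sample_path: str, label: str) -> dict:
    """Package a 30-60 second English voice sample for cloning."""
    return {"sample": sample_path, "label": label, "language": "en"}


def build_speech_request(voice_id: str, text: str) -> dict:
    """Ask the service to speak `text` in the cloned voice."""
    return {"voice": voice_id, "text": text}


# A scammer's workflow needs nothing more than these two steps:
clone = build_clone_request("clip_of_target.mp4", "James")
speech = build_speech_request("voice_james_01", "Pick up milk on the way home.")
print(json.dumps(speech))
```

The point of the sketch is how little the workflow demands: one short audio sample in, arbitrary speech out, with no consent check anywhere in the loop unless the platform chooses to add one.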

The voices PlayKit generates, including mine, are eerily accurate. If I have one criticism, it's that the tone and emotion are a bit off. Cloned me sounds the same whether it's talking about what to pick up for dinner or saying it's been in a terrible car crash. Even exclamation points do not change the expression.

And yet, I could see people being fooled by this. Remember, anyone with access to 30 seconds of video of you speaking could effectively clone your voice and then use it as they wish. Sure, they'd have to eventually pay $5.99 a week to keep using it, but if someone is planning a financial scam, they might think it's worth it.

Platforms like this that do not require explicit permission for voice cloning are sure to proliferate, and my concern is that there are no safeguards or regulations in sight. Services like Descript, which require audio consent from the clone target, are outliers.

Play.ht claims that it protects people's voice rights. Here's an excerpt from its Ethical AI page:

Our platform values intellectual property rights and personal ownership. Users are permitted to clone only their own voices or those for which they have explicit permission. This strict policy is designed to prevent any potential copyright infringement and uphold a high standard of respect and responsibility.

It's a high-minded promise, but the reality is that I started recording 30-second clips of famous movie monologues by Benedict Cumberbatch and Al Pacino, and in less than a minute, had usable voice clones for both actors.

What's needed here is global AI regulation, but that needs agreement and cooperation at the government level, and right now that's not forthcoming. In 2023, then-President Joe Biden signed an Executive Order on AI that sought in part to offer some regulatory guidance (he followed up with another AI-related order early this year). The Trump administration is allergic to government regulation (and any Biden executive order) and quickly revoked it, but it has yet to propose anything to replace it. It seems the new plan is to hope that AI companies will be good digital citizens and at least try to do no harm.

Unfortunately, most of these companies are like weapons manufacturers. They're not harming people directly – no one who makes a voice cloner is calling your aging uncle and using your voice clone to convince him that he urgently needs to wire you thousands of dollars – but some of the people using their AI weapons are.

There's no easy solution for what I fear will become a voice-cloning crisis, but I would suggest that you no longer outright trust the voices you hear in videos, on the phone, or in voice messages. If you're in any doubt, contact the relevant person directly.

In the meantime, I hope that more voice platforms insist on recorded voice consent and/or documented permission before they allow users to clone anyone's voice.
