At the intersection of “pretty neat” and “dystopian fear,” Baidu has announced new software that uses a neural network to clone a human voice from a seconds-long sample.
Once the voice is cloned, you can make it say anything. You can even switch around details like gender or accent. The technology isn’t entirely new: Baidu, Adobe, and smaller startups have all played around with it, developing ways to clone voices from samples twenty or thirty minutes long. But the idea of cloning a voice from a sample of under one minute is kind of unnerving.
There are applications for the software outside of the obvious (that is, apps that make your friends say “I eat poo” and Spy vs. Spy-style future warfare). People have suggested using your cloned voice to read a book to your child from thousands of miles away, or digitally replacing voices for patients who have lost them.
On the flip side, we could also be facing a new world of fraud possibilities. What if your grandma gets a call from your cloned voice asking for her Social Security number? What if Russian meme hackers send out a command in the fleet admiral’s distinctive Alabama drawl to “deploy the missiles”?
New Scientist reports that the cloned voices fooled voice recognition software more than 95% of the time in controlled tests, and human listeners gave the cloned voice a score of 3.16 out of 4. You can listen to some samples here and check out the program for yourself.
Cover image: Thinkstock