by stbullard 1158 days ago
Nothing in this article gives any reason to believe that the voice on the other end of the phone was AI-generated.

Occam’s razor suggests it was more likely a human pretending to be her daughter.

I’m guessing that after she realized the voice wasn’t her daughter’s, the mother convinced herself it must have been a deepfake, to explain having been so easily convinced.

1 comments

This is likely, as currently only ElevenLabs has the publicly available technology to do this, and it is SaaS-only (and I presume it requires law-enforcement-traceable usage after that first weekend and 4chan's fun with it).
I've played with ElevenLabs' solution; it's really good, but I don't think it could believably emulate the voice of someone under panic or duress. The voices it produces have a pretty believable, but flat, affect. Maybe it has options I haven't played with, or maybe you could train it to add panic to the voice.

Idk, something about this story feels fishy to me. I don't doubt that we're headed toward this kind of future, but we're talking about very high-level spear-phishing to accomplish this today: knowing the individual isn't home, having samples of their voice, and having highly sophisticated AI knowledge to make the voice and add elements of panic and duress. To actually succeed you'd also need to compromise lines of communication to the individual, e.g. do this while they're on a trip to the Amazonian rainforest or something. This feels more like an AI hit-piece than an actual done-with-AI story.

I cloned my friend's voice a few weeks ago with Tortoise-TTS, using 20 seconds of audio and Google Colab: https://github.com/neonbjb/tortoise-tts

Other friends said it sounded a lot like him.

I did a quick search and found this among a list of many: https://voice.ai/