HN Mirror


by Aachen 1175 days ago

I'd be curious what the false positive rate on that is. Can you clone anyone's voice by collecting a set of ten voices with unique timbres reading the required statement, plus pitch control to get close enough? A hundred? Or can you trick the neural net by feeding it something that sounds like white noise to humans until the NN triggers in the right way and goes "ok yep that's a match, you're authorised now"?

Probably not something we'll get to hear as part of the PR pitch.

Or is the consent statement itself the thing that gets cloned, with no separate training audio? Then it might actually work, and you'd just have to get close enough that the human you're trying to fool can't tell the difference anymore (which defeats the need for this tech in the first place, at least in targeted rather than automated cases).
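A rough way to think about the "ten voices? A hundred?" question: if each soundalike attempt passes the check with some false accept rate p, and the attempts are treated as independent (a simplifying assumption, since similar voices would likely be correlated), the chance that at least one gets through is 1 - (1 - p)^N. The sketch below just evaluates that formula; the function name and the 1% rate are made up for illustration and say nothing about the real service.

```python
def chance_of_false_accept(per_attempt_far: float, attempts: int) -> float:
    """Probability that at least one of `attempts` tries is wrongly accepted.

    Assumes every attempt is independent and has the same false accept
    rate, which is an idealisation and not a property of any real verifier.
    """
    if not 0.0 <= per_attempt_far <= 1.0:
        raise ValueError("per_attempt_far must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # All attempts fail with probability (1 - p)^N; take the complement.
    return 1.0 - (1.0 - per_attempt_far) ** attempts


if __name__ == "__main__":
    hypothetical_far = 0.01  # illustrative 1% per attempt, not a measured value
    for n in (10, 100):
        print(n, round(chance_of_false_accept(hypothetical_far, n), 3))
```

Under those assumptions, even a small per-attempt rate adds up quickly as the number of tries grows, which is why the per-attempt false positive rate matters so much here.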

2 comments

ros86 1175 days ago

Yeah, good point - don't know. When I tried it, I actually did get a (personal?) email saying that it didn't match closely enough. After I uploaded another sample (based on a different text), it went through.

I like your idea of just training on the consent text! That wasn't the case when I tried it, as you needed around 3h (optimally) of training data.


cortesoft 1175 days ago

If someone has the capability to trick the service like that, they likely have the capability to recreate the functionality themselves.


Aachen 1175 days ago

With a couple of soundalike voices and changing the pitch in Audacity? That's a far, far cry from building cutting-edge neural networks that clone voices from samples of less than half a minute.

If you mean the white noise, I meant that as a brute-force attack because, to make it more targeted (to know what it'll accept as sounding like your target voice), you'd likely need their exact model rather than one of your own.
