by manishsharan 702 days ago
Are there any use cases driving this? Is there a huge burning need for this technology?

Are kidnappers and con men a huge underserved market that Google is hoping to serve? Are deepfake videos not convincing enough to serve the needs of fraudsters?

I am totally against regulating AI but shit like this gives fodder to the other side.

5 comments

Voice anonymization is the use-case mentioned by the paper. If you're recording a video or communicating online, having this over your voice would benefit privacy by avoiding revealing your real voice that can be matched back to your face/name/job/etc. I think a lot of people are currently reluctant to use their voice at all online for privacy reasons, resorting to only text.

It also allows people uncomfortable with their natural voice, in particular transgender people, to communicate closer to how they wish to be perceived. It could even let someone use their own natural voice from previous recordings if a temporary or chronic disease or disorder has impaired it.

There are probably a bunch of creative applications - like doing character voices for a D&D session or reading an audiobook. Obviously depends on the preferences of those involved, and many will currently dislike it on the basis of it being AI, but I think over time we'll see the tech integrated in interesting ways.

I imagine the majority of the use will be in entertainment/memes/satire - joining a call with an amusing voice on, or the equivalent of Snapchat's face filters. Not something critical that we couldn't do without, but still a fun application.

I don't see much benefit to kidnappers in this; if you just need to send an anonymous message without much concern about flow and latency, text or traditional TTS is fine.

Since the quality is pretty listenable, one use case I can see is YouTubers who want to do voiceovers on their videos without being linked to their real-world identity.

Heck, I can even see broadcasting uses. Imagine if every on-air personality had good target files made ahead of time, so that when they catch a cold, production runs their lapel mic feed through this, using the "good" target sample, and removes all the congestion and raspiness.

You're totally against regulating AI, but the idea that AI could aid anonymity makes you want to regulate AI?

> I am totally against regulating AI but shit like this gives fodder to the other side.

You think anonymity is so universally hated that it's actually bad PR for leaving AI completely unregulated? No other problems with AI that you can think of, and also no good reason why someone should be allowed to be anonymous?

> applicable to real-time communication scenarios like calls and video conferencing, and addressing use cases such as voice anonymization in these scenarios.

It’s not a desire I ever had. But maybe people are different?

Alternatively, building the solution was so much fun that the question of whether this is a problem that should be solved was never asked.

I had a couple of use cases for this. The first: one of my very young cousins usually has voice chat on in his gaming sessions, and I wanted to anonymize his voice.

The second: we have a very enthusiastic video spokesperson, but unfortunately she has a very thick non-American accent, and this could help us soften it.

This will not resolve your second issue as it replaces timbre but not accent.