Hacker News
by fraud 1889 days ago
My issue with the whole idea of background recording for advertising is that it would be incredibly costly to store this data, transcribe the audio and turn it into anything even remotely meaningful for advertisers. I also don’t know a lot about this subject, so if anyone has better info, that’d be great.
4 comments

You don't need to store the data, just transcribe it. That's basically the business model for Siri, Alexa et al. If you're worried about cost, just offload the work to the cell phone and accept the less-than-100% transcription.

The only reason I don't think the big players are doing this _is_ the potential for scandal. Random apps on the app store that ask for a million permissions, on the other hand, are probably doing this.

It only takes one clever hacker looking to make a name for themselves. With that said, there are plenty of cases where companies _were_ caught spying, so maybe it's not so cut and dry.

You can easily process the audio on the fly and reduce it to a probabilistic estimate of whether a tag from a predefined topic set was present in the conversation. Doesn't need to be 100% accurate. You don't need to store the audio - just stream it through the recognizer. The output of such a recognizer will be something on the order of 8-32 bytes (an int for the tag, a float for the probability, an int64 for the timestamp), possibly less if one's clever - and it only needs to be stored until the next opportunity to send it out.
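To put a number on that, here's a rough sketch of what one such record could look like. The field widths are my own assumptions (a 32-bit tag id, a 32-bit float, a 64-bit timestamp), and `pack_detection` / `pack_compact` are hypothetical names, not anything a real SDK ships:

```python
import struct

# Assumed layout: uint32 tag id, float32 probability, int64 Unix timestamp.
# "<" means little-endian with no padding, so the size is exactly 4 + 4 + 8.
RECORD = struct.Struct("<Ifq")

# "If one's clever": uint16 tag id, probability quantized to one byte,
# uint32 timestamp in seconds. Also an assumption, just to show the idea.
COMPACT = struct.Struct("<HBI")


def pack_detection(tag_id: int, probability: float, timestamp: int) -> bytes:
    """Serialize one detection into the fixed-size record."""
    return RECORD.pack(tag_id, probability, timestamp)


def unpack_detection(blob: bytes) -> tuple:
    """Inverse of pack_detection: returns (tag_id, probability, timestamp)."""
    return RECORD.unpack(blob)


def pack_compact(tag_id: int, probability: float, timestamp: int) -> bytes:
    """Smaller variant: probability is stored as an integer in 0..255."""
    return COMPACT.pack(tag_id, round(probability * 255), timestamp)


blob = pack_detection(42, 0.87, 1_600_000_000)
print(len(blob), len(pack_compact(42, 0.87, 1_600_000_000)))
```

With these assumed widths the full record is 16 bytes and the compact one is 7, so both land in or below the 8-32 byte range mentioned above.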

Also: people seem to be looking at modern speech recognizers on their phones and wrongly concluding that speech recognition in general is very compute-intensive. It isn't, if you're willing to make some sacrifices on accuracy and generality, and to do it locally instead of sending voice data off to a cloud somewhere. A proper benchmark here isn't Siri or Google Assistant - it's Microsoft Speech API, as shipped with Windows more than 12 years ago.

> store this data, transcribe the audio and turn it into anything even remotely meaningful for advertisers

I disagree - even shitty, low CPU on-device transcription could give a signal to advertising algos.

I doubt this is being done, but it is definitely within the range of possibility and wouldn't even drain your phone battery that much.

All it needs to be is a list of keywords associated with your advertising profile.
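A minimal sketch of why a bad transcript is still enough for that: misheard words just drop out, and whatever keywords survive get counted. The topic map and the `topics_in` helper are made up for illustration, and the transcript is assumed to come from some on-device recognizer that isn't shown here:

```python
import re

# Hypothetical keyword-to-topic map; a real ad profile would use its own taxonomy.
TOPIC_KEYWORDS = {
    "travel": {"flight", "hotel", "vacation"},
    "pets": {"dog", "cat", "vet"},
}


def topics_in(transcript: str) -> dict:
    """Count keyword hits per topic in a (possibly error-ridden) transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    hits = {}
    for topic, keywords in TOPIC_KEYWORDS.items():
        count = sum(1 for word in words if word in keywords)
        if count > 0:
            hits[topic] = count
    return hits


# "shud" is a transcription error; it is simply ignored.
print(topics_in("we shud book a flight and a hotel for the dog"))
```

Only the topic names and counts would need to leave the device, not the transcript itself.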