Hacker News
by Flammy 3464 days ago
> It would cost more money than God in hardware to store every thing Alexa ever heard

It depends. First of all, storing compressed audio isn't that space-expensive, especially in long-term storage like S3. They could also be storing only the transcriptions and not the voice recordings behind them, which would be far less data.
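To put rough numbers on that, here is a back-of-envelope sketch. Every input (bitrate, clip length, queries per day, device count, transcript size) is an assumption picked for illustration, not a figure Amazon has published, and the function names are made up for this example.

```python
# Back-of-envelope storage estimate for voice queries.
# All constants below are illustrative assumptions, not known Amazon figures.

def audio_bytes(seconds: float, bitrate_kbps: float) -> float:
    """Size of one compressed clip: bitrate (kilobits/s) * duration, in bytes."""
    return seconds * bitrate_kbps * 1000 / 8


def yearly_bytes(bytes_per_query: float, queries_per_day: float, devices: int) -> float:
    """Total bytes stored per year across all devices."""
    return bytes_per_query * queries_per_day * 365 * devices


QUERY_SECONDS = 5        # assumed length of one spoken request
BITRATE_KBPS = 16        # assumed low-bitrate speech codec
QUERIES_PER_DAY = 10     # assumed requests per device per day
DEVICES = 1_000_000      # assumed number of devices
TRANSCRIPT_BYTES = 50    # assumed size of one text transcription

per_query_audio = audio_bytes(QUERY_SECONDS, BITRATE_KBPS)
audio_per_year = yearly_bytes(per_query_audio, QUERIES_PER_DAY, DEVICES)
text_per_year = yearly_bytes(TRANSCRIPT_BYTES, QUERIES_PER_DAY, DEVICES)

TB = 1e12  # decimal terabyte
print(f"audio:       {audio_per_year / TB:.2f} TB/year")
print(f"transcripts: {text_per_year / TB:.4f} TB/year")
print(f"audio is {audio_per_year / text_per_year:.0f}x larger than text")
```

With these assumed inputs the audio comes to about 36.5 TB a year and the transcripts to about 0.18 TB, so text-only storage is roughly 200 times smaller. Change the constants and the totals scale linearly.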

We don't know, as Amazon hasn't been very forthcoming about the privacy aspects of Alexa. I personally suspect they are keeping some voice information so they can use it to improve their NLP. I hope they are doing so in a way that is detached from accounts / IDs, but you never know.

Additionally, you can indeed delete a record of the query from the app, but who knows if the voice data or even the query itself is still stored after deletion, just not visible to us end users.

3 comments

> but who knows if the voice data or even the query itself is still stored after deletion, just not visible to us end users.

Almost definitely yes. I've never known a tech company that truly deletes anything.

"Never really delete" is actually standard advice. There are loads of reasons, mostly non-nefarious, why you may want or need that data.

Sometimes deleted stuff is archived offline or in slow warehouse databases that are not live, etc.

If it stored everything (and not just requests after the wake word) then it would end up trying to store audio or transcriptions of so many hours of TV and random conversations that it would be ridiculous. And that's just my house. I imagine most people have one somewhere near a TV, and it would do the same.
I'll point out that Facebook is using this always-on recording for advertising purposes, and one of those purposes is to fuel a Nielsen-like TV/movie/audio popularity business.

Basically, Facebook's always-on audio listening in their mobile app (Messenger, I believe, but it might be both these days) was providing this data. I can't remember the name of the company, but here is another tech company doing the same:

> Symphony uses just one: an app, downloaded to the cellphones of its more than 15,000 panelists. Audio recognition software then picks up whatever people are tuning into, wherever they’re tuning into it: their TV sets, their laptops, or their smartphones. “[It] measures everything you want to measure from one approach,” says Bill Harvey, a media research consultant who’s worked with Symphony

https://theringer.com/tv-ratings-streaming-nielsen-symphony-...

I would think it would be possible and even beneficial to dedupe the data (15m homes x NFL broadcast, for example). Link a list of each Echo's text conversions where the data is similar but the background noise perhaps differs. Or maybe getting data from multiple Echos in different homes at the same time allows for "noise" filtering (people asking different things while the same background noise is present).
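A minimal sketch of the dedupe idea, assuming the transcriptions from different homes normalize to the same text. The `DedupStore` class and its fields are hypothetical, and exact hashing like this would not match noisy audio or slightly different transcriptions; that would need some form of fuzzy fingerprinting, which is not shown here.

```python
import hashlib
from collections import defaultdict


class DedupStore:
    """Store each distinct transcript once and keep a list of who heard it."""

    def __init__(self):
        self.blobs = {}                # digest -> normalized transcript (stored once)
        self.refs = defaultdict(list)  # digest -> [(device_id, timestamp), ...]

    def put(self, device_id: str, timestamp: int, transcript: str) -> str:
        # Normalize case and whitespace so trivially different copies collide.
        normalized = " ".join(transcript.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        self.blobs.setdefault(digest, normalized)          # content saved only once
        self.refs[digest].append((device_id, timestamp))   # cheap per-device link
        return digest


store = DedupStore()
# Three homes hear the same broadcast line with small formatting differences.
store.put("echo-a", 1000, "Touchdown on the opening drive")
store.put("echo-b", 1000, "touchdown on the  opening drive")
store.put("echo-c", 1001, "TOUCHDOWN ON THE OPENING DRIVE")
# One home asks something different over the same broadcast.
store.put("echo-a", 1002, "what's the weather tomorrow")

print(len(store.blobs), "distinct transcripts stored")
```

The broadcast line is stored once with three references, while the unique request gets its own entry, which is the split between shared background content and per-home queries that the comment describes.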
Can't vote you up enough.