Hacker News
by johnjac 3146 days ago
Yes, but we could look at the total amount of data transmitted. Audio compression is well understood, so we can infer, within a range of usable quality, whether any excess voice or other data is being sent over the network.
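A rough sketch of that check, in case it helps. The bitrate ceiling and overhead factor are illustrative assumptions on my part, not measurements of any real device, and the function names are made up for this example.

```python
def expected_bytes(speech_seconds, bitrate_bps):
    """Bytes needed to carry speech_seconds of audio at a given codec bitrate."""
    return speech_seconds * bitrate_bps / 8


def looks_excessive(observed_bytes, speech_seconds,
                    max_bps=64_000, overhead=1.5):
    """Return True if observed traffic exceeds a generous upper bound.

    max_bps  : assumed highest plausible voice bitrate (hypothetical value)
    overhead : assumed allowance for protocol framing, TLS, retries, etc.
    """
    upper_bound = expected_bytes(speech_seconds, max_bps) * overhead
    return observed_bytes > upper_bound


# 10 seconds of speech: the upper bound is 10 * 64000 / 8 * 1.5 = 120000 bytes
print(looks_excessive(100_000, 10))    # within the bound
print(looks_excessive(1_000_000, 10))  # far above the bound
```

This only flags traffic above the assumed ceiling. It says nothing about what is inside traffic that stays under it.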
2 comments

So what you're saying is, if a company like Amazon or Google has the excess bandwidth, it is beneficial for them to send way too much data in the first place in order to disguise what data is actually being sent.

Now, there is some security basis for this:

http://www.cs.unc.edu/~fabian/papers/tissec2010.pdf

>Uncovering Spoken Phrases in Encrypted Voice over IP Conversations

Assuming it's sending it as audio, and not as transcribed text, which is both smaller and much more compressible.
ASR is a hugely complex process that is handled by ML algorithms on Amazon's servers. The Echo simply does not have the hardware to handle this on its own.
Is it, though? Not trying to be argumentative, but I remember using Dragon NaturallySpeaking to do voice dictation way back in, like, '98 on a processor that makes today's average smartphone look like a supercomputer. I thought all the ML stuff was for figuring out context and the like, but straight transcription?
Modern voice codecs are extremely compact. An annotated text representation of voice will take up equivalent space.