by sterlind 2016 days ago
Converting to text loses data and will fail on unexpected languages or accents. It's better to use a specialized voice codec, which can have fantastically high compression rates, or just keep the (already annoyingly-companded) 8kbit voice stream around in its entirety. It's pretty small.
2 comments

You must mean 8 kHz, which results in a 64 kbps stream (8000 bytes per second). The companding was actually a very good use of 8 bits per sample for voice, introducing few artifacts apart from those at high amplitudes and from the low-pass filter. Nowadays I find it ridiculous that mobile networks feel it's still necessary to compress the audio further: 64 kbps is nothing on modern mobile networks, i.e. VoLTE etc. WB-AMR is definitely an improvement with its 16 kHz sample rate and a bitrate lower than that of G.711, but it's mostly not supported between different mobile carriers.
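For anyone who wants to check the arithmetic, here is a rough Python sketch of the 8 kHz x 8 bit calculation and of why companding spends those 8 bits well. It uses the textbook continuous mu-law curve with mu = 255, not the exact segmented G.711 encoder, and the function names and the simple 256-level quantizer are my own assumptions for illustration.

```python
import math

SAMPLE_RATE_HZ = 8000   # telephone-band sampling rate
BITS_PER_SAMPLE = 8     # one companded byte per sample
MU = 255                # mu-law parameter (continuous approximation, not the G.711 tables)


def bitrate_bps(sample_rate_hz, bits_per_sample):
    """Raw bitrate of an uncompressed PCM-style stream."""
    return sample_rate_hz * bits_per_sample


def mu_law_compress(x, mu=MU):
    """Map a sample in [-1, 1] onto the companded range [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


def quantize_8bit(y):
    """Hypothetical uniform quantizer: [-1, 1] -> integer code 0..255."""
    return min(255, int((y + 1) / 2 * 256))


def dequantize_8bit(code):
    """Return the midpoint of the quantization cell for a code."""
    return (code + 0.5) / 256 * 2 - 1


def companded_roundtrip(x):
    """Compress, quantize to 8 bits, then expand again."""
    return mu_law_expand(dequantize_8bit(quantize_8bit(mu_law_compress(x))))


def linear_roundtrip(x):
    """Quantize to 8 bits with no companding, for comparison."""
    return dequantize_8bit(quantize_8bit(x))


if __name__ == "__main__":
    print(bitrate_bps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE))  # 64000 bits per second
    quiet = 0.01  # a low-amplitude sample, where linear 8-bit quantization is coarse
    print(abs(companded_roundtrip(quiet) - quiet))
    print(abs(linear_roundtrip(quiet) - quiet))
```

On a quiet sample the companded round trip should land much closer to the input than plain linear 8-bit quantization does. The price is a larger absolute error on loud samples, which fits the point about artifacts at high amplitudes.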
You convert it to text in order to index it. If it becomes interesting in the future, you listen to the audio (maybe after getting permission from a judge in a secret court).