Hacker News
by asfginino 2138 days ago
>Furthermore, Amazon hasn’t cracked the code on hyper-efficient GB into KB lossless compression only to squirrel it away only for use in voice assistants.

They could do speech recognition on the device and then ship off the plain text. I don't think they do this, but it is most certainly within their technical ability.

As a practical example, I have a copy of a 458,045 word audiobook on my computer and I just downloaded a copy of the e-book. The audiobook is just over 1 GiB, while the plain text of the e-book compressed with bz2 comes in at 800 KiB.
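To put rough numbers on that comparison, here is a quick Python sketch. It uses only the figures quoted above, with "just over 1 GiB" rounded down to exactly 1 GiB (an assumption). The `sample` transcript at the end is made up purely to show the bz2 round trip; it is far more repetitive than real text, so its compression ratio is not representative.

```python
import bz2

# Figures quoted in the comment above.
WORDS = 458_045
AUDIO_BYTES = 1 * 1024**3      # assumption: "just over 1 GiB" treated as exactly 1 GiB
TEXT_BZ2_BYTES = 800 * 1024    # 800 KiB of bz2-compressed plain text

ratio = AUDIO_BYTES / TEXT_BZ2_BYTES
audio_bytes_per_word = AUDIO_BYTES / WORDS
text_bytes_per_word = TEXT_BZ2_BYTES / WORDS

print(f"audio is about {ratio:.0f}x larger than the compressed text")
print(f"audio: about {audio_bytes_per_word:.0f} bytes per word")
print(f"bz2 text: about {text_bytes_per_word:.2f} bytes per word")

# Hypothetical transcript, only to show that bz2 itself is lossless.
sample = ("alexa what is the weather like today " * 200).encode("utf-8")
compressed = bz2.compress(sample)
roundtrip = bz2.decompress(compressed)

print(f"sample: {len(sample)} bytes raw, {len(compressed)} bytes compressed")
print("round trip is lossless:", roundtrip == sample)
```

With those figures the audio comes out roughly 1,300 times larger than the compressed text, a bit over 2 KB of audio per word against under 2 bytes of text per word. The bz2 step is lossless; whatever information gets lost is lost in the speech-to-text step.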


Small nitpick, but speech-to-text isn't lossless. How someone says the words has a lot of impact on the meaning of those words. It's possible Amazon doesn't care about that lost information, but that is a completely different conversation.

That's true. Speech-to-text also isn't entirely accurate, which loses some information.

Even when they actually do ship audio off the device for processing, I'd be surprised if it's done losslessly.

For anyone curious about this, I highly recommend building a weekend project on CMU Sphinx. It will really give you a feel for how machine interpretation depends on the training data and on the actual application code.

> They could do speech recognition on the device and then ship off the plain text

Not with the tiny cheap CPUs that are on these devices they couldn't. Or at least not very effective speech recognition.

I think you overestimate how CPU intensive the inference is.