HN Mirror

by asfginino 2138 days ago

>Furthermore, Amazon hasn’t cracked the code on hyper-efficient GB into KB lossless compression only to squirrel it away only for use in voice assistants.

They could do speech recognition on the device and then ship off the plain text. I don't think they do this, but it is most certainly within their technical ability.

As a practical example, I have a copy of a 458,045 word audiobook on my computer and I just downloaded a copy of the e-book. The audiobook is just over 1 GiB, while the plain text of the e-book compressed with bz2 comes in at 800 KiB.
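For a rough sense of scale, the arithmetic behind that comparison can be sketched as below. The audiobook size is rounded down to exactly 1 GiB (the comment only says "just over"), and `size_ratio` and `bz2_size` are hypothetical helper names made up for this illustration, not anything Amazon is known to use.

```python
import bz2

GIB = 1024 ** 3
KIB = 1024

# Figures from the comment above; the audiobook is "just over 1 GiB",
# so treating it as exactly 1 GiB slightly understates the ratio.
audio_bytes = 1 * GIB
text_bytes = 800 * KIB      # bz2-compressed plain text of the e-book
word_count = 458_045


def size_ratio(larger: int, smaller: int) -> float:
    """How many times bigger `larger` is than `smaller` (hypothetical helper)."""
    return larger / smaller


def bz2_size(text: str) -> int:
    """Size in bytes of `text` after UTF-8 encoding and bz2 compression."""
    return len(bz2.compress(text.encode("utf-8")))


ratio = size_ratio(audio_bytes, text_bytes)
bytes_per_word = text_bytes / word_count

print(f"audio is roughly {ratio:.0f}x larger than the compressed text")
print(f"compressed text costs about {bytes_per_word:.2f} bytes per word")
```

To try this on your own e-book, something like `bz2_size(open("book.txt", encoding="utf-8").read())` should give the compressed size, where `book.txt` is a placeholder path.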

2 comments

zachm0 2138 days ago

Small nitpick, but speech-to-text isn’t lossless. How someone says the words has a lot of impact on the meaning of those words. It’s possible Amazon doesn’t care about that lost information, but that is a completely different conversation.

asfginino 2138 days ago

That's true. Speech to text also isn't entirely accurate, which loses some information.

Even when they actually do ship audio off the device for processing, I'd be surprised if it's done losslessly.

scollet 2137 days ago

For anyone curious about this, I highly recommend developing on CMU Sphinx for a weekend project. It will really paint some pictures about machine interpretation based on training data and the actual application code.

lotu 2138 days ago

> They could do speech recognition on the device and then ship off the plain text

Not with the tiny, cheap CPUs that are on these devices, they couldn't. Or at least not very effective speech recognition.

MiroF 2137 days ago

I think you overestimate how CPU intensive the inference is.
