by Vendan 3091 days ago
You'd imagine so, but as someone who's tried, voice recognition is just one part of it, and it's a rather hard problem in terms of the computing power required. Note that the linked DeepSpeech stuff is a TensorFlow-based solution, and it only hits an xRT of 0.44 on a GTX 1070. While that is slightly better than twice realtime, I really doubt anything less powerful than a GPU that's a handful of years old is going to pull it off in realtime, and definitely not a Raspberry Pi or similar.
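
For anyone who wants the arithmetic behind that number, here's a rough sketch. It assumes xRT means processing time divided by audio duration (so lower is faster), which is the reading that matches "slightly better than twice realtime". The function names are made up for illustration and aren't from DeepSpeech.

```python
def realtime_speedup(xrt: float) -> float:
    """How many times faster than realtime a given xRT is.

    Assumes xRT = processing_time / audio_duration.
    """
    if xrt <= 0:
        raise ValueError("xRT must be positive")
    return 1.0 / xrt


def processing_seconds(audio_seconds: float, xrt: float) -> float:
    """Seconds of compute needed to transcribe a clip of the given length."""
    return audio_seconds * xrt


if __name__ == "__main__":
    # The figure quoted above for a GTX 1070.
    print(f"{realtime_speedup(0.44):.2f}x realtime")
    print(f"{processing_seconds(10.0, 0.44):.1f} s to process a 10 s clip")
```

Anything with an xRT above 1.0 falls behind the incoming audio, which is the worry with weaker hardware.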

edit: follow-up clarification: the Echo is NOT doing voice processing on the device itself; it ships the audio up to the cloud for that. You could of course set up something similar using a Raspberry Pi that ships audio to your desktop to be processed.
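
A minimal sketch of the "ship audio to your desktop" part, using only the Python standard library. Everything here is an assumption for illustration: the length-prefixed framing, the helper names `send_chunk` and `recv_chunk`, and raw PCM bytes as the payload. Capturing from a microphone and running the recognizer are left out, and this is not how the Echo does it.

```python
import socket
import struct

# Hypothetical framing: each audio chunk is sent as a 4-byte big-endian
# length followed by that many bytes of raw audio.
HEADER = struct.Struct("!I")


def send_chunk(sock: socket.socket, pcm: bytes) -> None:
    """Pi side: send one length-prefixed chunk of audio."""
    sock.sendall(HEADER.pack(len(pcm)) + pcm)


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since recv() may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            raise ConnectionError("peer closed the connection mid-frame")
        buf.extend(part)
    return bytes(buf)


def recv_chunk(sock: socket.socket) -> bytes:
    """Desktop side: receive one chunk, ready to hand to a recognizer."""
    (length,) = HEADER.unpack(_recv_exact(sock, HEADER.size))
    return _recv_exact(sock, length)


if __name__ == "__main__":
    # Local stand-in for a real Pi-to-desktop TCP connection.
    pi_side, desktop_side = socket.socketpair()
    send_chunk(pi_side, b"\x00\x01" * 160)  # fake 320-byte audio chunk
    print(len(recv_chunk(desktop_side)))  # prints 320
    pi_side.close()
    desktop_side.close()
```

On real hardware you'd swap the `socketpair()` for `socket.create_connection()` on the Pi and a listening socket on the desktop, then feed each received chunk to whatever recognizer runs there.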