Hacker News
by larsberg 2830 days ago
The issues with text input are a large part of why we added voice input even in our very first release! With respect to text rendering, we're doing a lot of fundamental work on rendering higher-quality text (you can't fix resolution fundamentals, but you can take the optics of the hardware into account better than we all do today), but there's nothing to announce yet.
1 comments

What is doing the speech-to-text processing? Is this part of the Common Voice project? Is the code available to use outside of Firefox Reality?
I checked and it's using Mozilla's DeepSpeech library: https://github.com/MozillaReality/FirefoxReality/blob/dee6f4...

which is here: https://github.com/mozilla/DeepSpeech

DeepSpeech is the actual voice recognition library; Common Voice is a project to improve its accuracy by gathering more voice data.

Interesting. The DeepSpeech GitHub page indicates that it's not capable of real-time transcription, even using a GPU. How is that fast enough for this use case?