by skykooler 2830 days ago
What is doing the speech-to-text processing? Is this part of the Common Voice project? Is the code available to use outside of Firefox Reality?

I checked and it's using Mozilla's DeepSpeech library: https://github.com/MozillaReality/FirefoxReality/blob/dee6f4...

which is here: https://github.com/mozilla/DeepSpeech

DeepSpeech is the actual speech recognition library; Common Voice is a project to improve its accuracy by gathering more voice data.

Interesting. The DeepSpeech GitHub page indicates that it's not capable of real-time transcription, even on a GPU. How is that fast enough for this use case?