by cjbassi 2311 days ago
For those on Linux, I've been working on a Talon-inspired voice coding program called Osprey that uses the Google Cloud Speech-to-Text API: https://github.com/osprey-voice/osprey.

It's still very much a work in progress, but it's already been working very well for me, and I'm actually using it to type out this response right now.

Why would you work with Google when there are much more accurate open-source speech recognizers based on Kaldi? For that specific use case it is very easy to beat Google on accuracy.
I think using Google as the primary engine results in stuff like this: https://github.com/osprey-voice/osprey-starter-pack/blob/mas...

On the other hand, it’s probably better at general (non-command) English.

Yeah, it uses a lot of machine learning and context-based inference, which is great for dictating phrases but less so for commands.
Actually, I don't think I ended up testing Kaldi because it seemed difficult to set up, but I'll give it a try now that you mention it.
OK, if you want to start with Kaldi, it is probably easier to check out kaldi-active-grammar, mentioned above, or https://github.com/alphacep/vosk-api
Awesome, thanks, I'll check this out and the other one too.