Hacker News
by punchingwater 3120 days ago
Don't forget Kaldi!

https://github.com/kaldi-asr/kaldi

The problem with Kaldi is that it's virtually impossible to get a dictation model working with Kaldi unless you have a doctorate in speech recognition. There is no "I know basic programming, but little about speech recognition" documentation for Kaldi.
Between the learning curve and dependency hell, I've never managed good results with Kaldi, Simon, or Sphinx. It's unfortunate; hopefully we'll get an easy-to-use option soon.
When was the last time you tried Sphinx? The library has changed a LOT. Their guides, new website, and other resources basically walk you from zero knowledge to a working demo.
Oh, I will have to try again :) Thanks for the tip; I always thought Sphinx should be ideal, it was just too much work to get it working.
There's the "Kaldi for Dummies" tutorial [1], which helped me to the point of creating a speech recognition program that could distinguish digits in recordings of my voice. I guess that's the documentation you're looking for.

My personal problem with Kaldi is that I don't have enough RAM in my cheap laptop to work with any of the big models. When it started swapping just doing the preprocessing for one of the pretrained models [2], I kind of abandoned that project until I bother to get new hardware.

For that reason, I can't tell how good the pretrained models really are.

[1] http://kaldi-asr.org/doc/kaldi_for_dummies.html
[2] http://kaldi-asr.org/models.html