by skykooler 3120 days ago
The problem with Kaldi is that it's virtually impossible to get a dictation model working with Kaldi unless you have a doctorate in speech recognition. There is no "I know basic programming, but little about speech recognition" documentation for Kaldi.

Between the learning curve and dependency hell, I've never managed good results with Kaldi, Simon, or Sphinx. It's unfortunate; hopefully we'll get an easy-to-use option soon.
When was the last time you tried Sphinx? The library has changed a LOT. Their guides, new website, and other resources basically walk you from zero knowledge to a working demo.
Oh, I will have to try again :) Thanks for the tip; I always thought Sphinx should be ideal, it was just too much work to get it working.
There's the "Kaldi for Dummies" tutorial [1], which helped me to the point of creating a speech recognition program that could distinguish digits in recordings of my voice. I guess that's the documentation you're looking for.

My personal problem with Kaldi is that I don't have enough RAM in my cheap laptop to work with any of the big models. When it started swapping just doing the preprocessing for one of the pretrained models [2], I kind of abandoned that project until I bother to get new hardware.

For that reason, I can't tell how good the pretrained models really are.

[1] http://kaldi-asr.org/doc/kaldi_for_dummies.html
[2] http://kaldi-asr.org/models.html