Hacker News
by daanzu 2202 days ago
Just to be clear, the Dragonfly speech recognition command-and-control framework has multiple "backends" (speech recognition engines), including my Kaldi one. The most used one currently is probably the Dragon NaturallySpeaking backend.

The Kaldi engine, being developed primarily for research in speech recognition, can support a huge variety of "models". I think the general consensus is that the best choice for most use cases (particularly real-time, low-latency, streaming use) is currently the "nnet3 chain" models, which are what my kaldi-active-grammar uses and supports.
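
To make the layering above a bit more concrete, here is a rough sketch in Python. Every class and name in it is made up purely for illustration; it is not Dragonfly's or Kaldi's actual API. It only tries to show the relationship: the framework delegates to one backend engine, and an engine such as Kaldi loads a model.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical names throughout: an illustration of the layering only,
# not Dragonfly's or Kaldi's real API.

@dataclass
class Model:
    """Trained data that an engine loads (e.g. an "nnet3 chain" model)."""
    name: str

@dataclass
class Engine:
    """A speech recognition backend; some engines can load different models."""
    name: str
    model: Optional[Model] = None

    def describe(self) -> str:
        if self.model is None:
            return self.name
        return f"{self.name} using model '{self.model.name}'"

@dataclass
class Framework:
    """The command-and-control layer; it delegates recognition to one backend."""
    engine: Engine

    def describe(self) -> str:
        return f"framework -> {self.engine.describe()}"

# One framework, two interchangeable backends.
kaldi = Framework(Engine("kaldi", Model("nnet3 chain")))
dragon = Framework(Engine("dragon"))  # no separately chosen model in this sketch

print(kaldi.describe())
print(dragon.describe())
```

So in this picture the model is not the engine: the engine is the software doing the recognition, and the model is the trained data it runs with.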

1 comment

Thank you, I think I understand partially, but not fully, as I'm not very well versed in speech recognition software.

Basically, my question (and I assume many other users') is "I run <Linux/Windows/Mac OS>, what are my options and how good will my recognition be with each?". Your answer above helps, but it doesn't entirely satisfy me, as I'm not sure if a model is the recognition engine, or if the engine uses the model, or how I can use it, etc.