|
|
|
|
|
by daanzu
2284 days ago
|
|
I develop Kaldi Active Grammar [1], which is mainly intended for use with strict command grammars. Compared to normal language models, these can provide much better accuracy, assuming you can describe (and speak) your command structure exactly. (This is probably more acceptable for a voice assistant for an audience that is more technical.) The grammar can be specified by a FST, or you can use KaldiAG through Dragonfly, which allows you to specify them (and their resultant actions) in Python. However, KaldiAG can also do simple plain dictation if you want. KaldiAG has an English model available, but other models could be trained. Although you can't just drop in and use a standard Kaldi model with KaldiAG, the modifications required are fairly minimal and don't require any training or modification of its acoustic model. All recognition is performed locally and off line by default, but you can also selectively choose to do some recognition in the cloud, too. Kaldi generally performs at the state of art. As a hybrid engine, although training can be more complicated, it generally requires far less training data to achieve high accuracy, compared to "end to end" engines. [1] https://github.com/daanzu/kaldi-active-grammar |
|