Hacker News
by skykooler 3773 days ago
Julius [1] is a pretty good offline speech recognition engine. In my tests it seems to have about 95% accuracy in grammar-based models, and it supports continuous dictation. There is also a decent Python module which supports Python 2, and Python 3 with a few tweaks.

HOWEVER:

The only continuous dictation models available for Julius are Japanese, as it is a Japanese project. This is mainly an issue of training data. The VoxForge project is working towards releasing an English model once it has 140 hours of training data (last time I checked it was at around 130); but even so, the quality is likely to be far lower than that of commercial speech recognition products, which are generally trained on thousands of hours of data.

[1] http://julius.osdn.jp/en_index.php

2 comments

Julius is my preferred speech recognition engine. I've built an application[0] which enables users to control their Linux desktops with their voices, and uses Julius to do the heavy lifting.

[0]: https://github.com/SacredData/COMPUTER

After a quick look, it seems Julius doesn't use the new deep-learning stuff?

In terms of data, http://www.openslr.org/12/ says it has 300+ hours of speech and text from LibriVox audiobooks. Using LibriVox recordings seemed like a great idea for making a large, freely available dataset.