Hacker News
by alpe 3382 days ago
Another possibility is to run an automatic speech recognition (ASR) system on the audio (e.g. Sphinx or PocketSphinx, which can read from the mic input) and align its output with the ground-truth text.

You need to handle imperfect matches, because the ASR output may differ slightly from the ground truth. But if you only want to chunk at, say, sentence granularity (and then move on to the next sentence), you should be able to do it in real time.
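To make the fuzzy-matching part concrete, here is a rough sketch in Python. It only covers the alignment step, not the ASR itself: it assumes you already get recognised text in chunks from whatever recogniser you use. The names `SentenceTracker`, `feed` and `tokenize`, and the 0.8 threshold, are my own placeholders for illustration, and the word-level `difflib` ratio is just one simple way to tolerate recognition errors.

```python
import difflib
import re


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


class SentenceTracker:
    """Tracks which ground-truth sentence the speaker is currently on.

    Hypothetical helper: the threshold is an assumed value that would
    need tuning against a real recogniser.
    """

    def __init__(self, sentences, threshold=0.8):
        self.expected = [tokenize(s) for s in sentences]
        self.threshold = threshold
        self.index = 0    # index of the sentence we are waiting for
        self.heard = []   # words recognised since the last advance

    def feed(self, asr_text):
        """Add newly recognised text and return the current sentence index."""
        self.heard.extend(tokenize(asr_text))
        while self.index < len(self.expected):
            target = self.expected[self.index]
            if not target:
                # Nothing to match in an empty sentence, so skip it.
                self.index += 1
                continue
            # Compare only the most recent words against the target sentence.
            recent = self.heard[-len(target):]
            ratio = difflib.SequenceMatcher(None, target, recent).ratio()
            if ratio < self.threshold:
                break
            # Close enough: move on to the next sentence and start fresh.
            self.index += 1
            self.heard = []
        return self.index


if __name__ == "__main__":
    tracker = SentenceTracker([
        "The quick brown fox jumps over the lazy dog.",
        "It was a bright cold day in April.",
    ])
    print(tracker.feed("the quick brown"))
    print(tracker.feed("fox jumped over the lazy dog"))
    print(tracker.feed("it was a bright cold day in april"))
```

In this toy run the second chunk contains "jumped" instead of "jumps", but the sentence still counts as finished because eight of the nine words line up. A real setup would probably also want to handle skipped sentences and restarts, which this sketch ignores.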