by trowngon 1937 days ago
Irrespective of the subject, DeepSpeech is a very old architecture with suboptimal results. You'd be better off trying one of the recent Conformer implementations (Flashlight, NeMo, WeNet, etc.) or wav2vec.

I ended up with DeepSpeech since it was very easy to get started with, and it supports fairly low-latency inference, which is very important for my project.

I will take a look at the ones you suggested though, thanks for the heads-up!