by leetharris 622 days ago
We released a new SOTA ASR as open source just a couple of weeks ago. https://www.rev.com/blog/speech-to-text-technology/introduci...

Take a look. We'll be open sourcing more models very soon!

3 comments

> These models are accessible under a non-commercial license.

That is not open source.

Exactly. It is source-available, not open source:

https://opensource.org/osd

That's great to hear! Amazing performance from the model!

For voice chatbots, however, short input utterances are the norm (anywhere from 1 to 10 seconds), with lots of silence in between, so this limitation is a bit sad:

> On the Gigaspeech test suite, Rev’s research model is worse than other open-source models. The average segment length of this corpus is 5.7 seconds; these short segments are not a good match for the design of Rev’s model. These results demonstrate that despite its strong performance on long-form tests, Rev is not the best candidate for short-form recognition applications like voice search.

I'll check it out.

FWIW, in terms of benchmarking, I'm more interested in benchmarks against Gladia, Deepgram, Pyannote, and Speechmatics than whatever is built into the hyperscaler platforms. But I end up doing my own anyway so whatevs.

Also, you guys need any training data? I have >10K hrs of conversational iso-audio :)