Hacker News
by abraxas 1177 days ago
Oh, Azure's speech recognition API beats it handily on English, both in accuracy and speed.

Another is Deepgram. Even this obscure vendor seems to be able to handle the samples I tried better than Whisper: https://picovoice.ai/platform/cat/

But yeah, go with Azure as your starting point. It is good and the price is likely acceptable unless you're transcribing all of YouTube.

1 comments

Umm I want to pay zero and run locally
If you use the large-v2 version of Whisper and give it a prompt to indicate what it's transcribing, it can do extremely well. But do use the prompt feature.
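For anyone who wants to try the prompt suggestion above, here is a rough sketch using the open-source `openai-whisper` Python package, where the prompt is passed as `initial_prompt` to `transcribe`. The helper names (`build_initial_prompt`, `transcribe_locally`) and the prompt wording are made up for illustration, not something from the comment above, and it assumes the package and ffmpeg are installed locally.

```python
def build_initial_prompt(topic, terms):
    """Compose a short context string to hint at what the audio contains.

    The wording is an illustrative guess; adjust it to your material.
    """
    prompt = f"This is a recording about {topic}."
    if terms:
        prompt += " Terms that may come up: " + ", ".join(terms) + "."
    return prompt


def transcribe_locally(audio_path, prompt, model_name="large-v2"):
    """Run Whisper on the local machine with a context prompt."""
    # Imported lazily so the helper above works without the package installed.
    import whisper  # pip install openai-whisper (ffmpeg must be on PATH)

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(audio_path, initial_prompt=prompt)
    return result["text"]
```

Usage would look something like `transcribe_locally("meeting.mp3", build_initial_prompt("speech recognition APIs", ["Whisper", "Azure", "Deepgram"]))`, where `meeting.mp3` is a placeholder file name. The large model is slow without a GPU, so a smaller model name may be more practical for a quick test.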
Yeah, exactly, this is why there’s hype. It’s the best model that you can easily use for free.