by staticautomatic
622 days ago
I’ve been building a production app on top of ASR and find the range of models kind of bewildering compared to LLMs and video. The commercial offerings seem to be custom, or built on top of Whisper or maybe NVIDIA Canary/Parakeet, and then you have stuff like SpeechBrain that seems to run on top of lots of different open models for different tasks. Sometimes it’s genuinely hard to tell what’s a foundation model and what isn’t. Separately, I wonder if this is the model Speechmatics uses.
|
Take a look. We'll be open sourcing more models very soon!