by elitan
778 days ago
I wouldn't be surprised if Region Blekinge were using something much worse and much more expensive than Whisper for their transcription. I've been transcribing A LOT of SR (Swedish Radio) shows as part of https://nyheter.sh/, and Whisper (self-hosted) has been very accurate.

Whisper + an LLM can recover some of the gaps by filling in contextually plausible bits, but then the output is no longer a verbatim transcript and may contain hallucinations.
There are alternatives that share Whisper's internal states with an LLM to improve ASR, as well as approaches that sample N-best hypotheses from Whisper and fine-tune an LLM to distill them into a single output. I haven't looked into these much yet, given how expensive each component already is to run on its own.
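
To make the N-best idea a bit more concrete, here is a rough sketch. It does not use a fine-tuned LLM at all: as a much cheaper stand-in, it picks the hypothesis that agrees most with the others, measured by word-level edit distance. The function names and the example sentences are made up for illustration, and a real N-best list would come from sampling the ASR model.

```python
from typing import List


def word_edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance computed over word tokens instead of characters."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def pick_consensus(hypotheses: List[str]) -> str:
    """Return the hypothesis with the lowest total distance to all the others.

    This is a simple stand-in for the LLM distillation step (an assumption of
    this sketch, not the method from the approaches mentioned above).
    """
    tokenized = [h.split() for h in hypotheses]
    totals = [
        sum(word_edit_distance(tokens, other) for other in tokenized)
        for tokens in tokenized
    ]
    return hypotheses[totals.index(min(totals))]


# Hypothetical N-best list; in practice these would be sampled from the ASR model.
n_best = [
    "the region uses whisper for transcription",
    "the region uses whisper for transcriptions",
    "the religion uses whisper for transcription",
]
best = pick_consensus(n_best)
print(best)
```

Unlike the LLM-based approaches, this can only return one of the existing hypotheses, so it cannot introduce words that none of them contain. That also means it cannot fix an error that all hypotheses share.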