by uncanneyvalley 1218 days ago
It’s not that your audio is being amplified, it’s that the VAD (voice activity detection) classifier is poorly tuned. The noise should never even reach the recognition stage. Whisper’s hallucinations are pretty severe, but adding VAD to its pipeline reduces them.
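To make the "never reach the recognition stage" point concrete, here is a rough sketch of a VAD gate sitting in front of a recognizer. It is only an illustration: it uses a crude RMS energy threshold rather than a trained classifier, and the names `vad_gate`, `transcribe_if_speech`, `frame_len`, `threshold` and the `recognizer` callable are all made up for the example, not taken from Whisper or any real library.

```python
import math


def frame_rms(frame):
    """Root-mean-square energy of one frame of float samples in [-1, 1]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def vad_gate(samples, frame_len=160, threshold=0.02):
    """Keep only frames whose RMS energy is at or above the threshold.

    Assumption: an energy threshold stands in for a real VAD classifier.
    frame_len=160 would be 10 ms at 16 kHz; both values are illustrative.
    """
    speech_frames = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if frame_rms(frame) >= threshold:
            speech_frames.append(frame)
    return speech_frames


def transcribe_if_speech(samples, recognizer, **vad_kwargs):
    """Call the (hypothetical) recognizer only when the gate passes audio."""
    frames = vad_gate(samples, **vad_kwargs)
    if not frames:
        # Nothing was classified as speech, so the recognizer never runs
        # and has no chance to hallucinate text from background noise.
        return ""
    audio = [s for frame in frames for s in frame]
    return recognizer(audio)
```

If `threshold` is set too low, background noise gets through the gate and the recognizer is asked to transcribe it, which is the poorly tuned case described above. Set it too high and quiet speech gets dropped instead.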