by geor9e 215 days ago
Okay, but have you used the large Whisper model? Sure, voice typing has been around for 10 or 20 years. And it's great if you have a good mic and enunciate, but these new models are insane. You can just mumble something from across an entire room, with peanut butter in your mouth, and it won't miss a single word.
1 comments

It might make up a bunch of words, like "subtitles by soandso", when there's silence though... /s
Yeah, you get "Like and Subscribe!" or "Thank you." or even Chinese back from the API if you send pure silence (or I guess it's white noise to the model once its volume is normalized). I think humans hallucinate in white noise or sensory deprivation too; maybe it's related.
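One way people work around this is to check the audio level before sending anything for transcription. Here's a rough sketch, assuming 16-bit mono PCM samples; the function names and the -50 dBFS threshold are made up for illustration and would need tuning for a real mic. It isn't part of Whisper or any API, just a pre-filter you'd run yourself.

```python
import math
from array import array

def rms_dbfs(samples, full_scale=32768.0):
    """RMS level of 16-bit PCM samples in dBFS (0 dBFS = full scale)."""
    if len(samples) == 0:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(math.sqrt(mean_square) / full_scale)

def is_probably_silence(samples, threshold_dbfs=-50.0):
    """Hypothetical guard: True if the clip is too quiet to be worth transcribing."""
    return rms_dbfs(samples) < threshold_dbfs

# One second of pure silence and one second of a 440 Hz tone at 16 kHz.
silence = array("h", [0] * 16000)
tone = array("h", [int(10000 * math.sin(2 * math.pi * 440 * n / 16000))
                   for n in range(16000)])

for name, clip in (("silence", silence), ("tone", tone)):
    if is_probably_silence(clip):
        print(f"{name}: skipped, nothing to transcribe")
    else:
        print(f"{name}: would send to the transcription step")
```

A plain energy threshold like this won't catch low-level room noise that gets boosted by normalization, so a proper voice activity detector is probably the better fix if that's your situation.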