Hacker News
by nneonneo 1151 days ago
See, at least Siri usually is ready to take your input the moment you press the button, even if it then casually discards your input because it can’t reach Apple’s servers or whatever.

Google Maps: I swear, about half the time I try to activate voice search, it sits and spins before even accepting any voice input at all. Why can’t it just start reading the microphone right when I activate it, and then submit the saved audio whenever it’s done getting set up? It’s so abysmally poor that it’s usually faster to scroll through recent destinations or literally grab the phone, unlock, and put in a destination.
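
To make the "start recording right away, submit later" idea concrete, here is a rough sketch. It says nothing about how Google Maps or Siri are actually built; `BufferedVoiceInput`, `on_audio_chunk`, `on_backend_ready`, and the `send` callback are all hypothetical names invented for illustration, and real microphone capture is left out entirely.

```python
import threading
from collections import deque
from typing import Callable


class BufferedVoiceInput:
    """Accept audio chunks immediately and forward them once the backend is ready."""

    def __init__(self, send: Callable[[bytes], None]) -> None:
        self._send = send              # hypothetical: submits one chunk to the recognizer
        self._pending: deque = deque() # chunks captured before setup finished
        self._ready = False
        self._lock = threading.Lock()  # mic and setup callbacks may run on different threads

    def on_audio_chunk(self, chunk: bytes) -> None:
        """Called by the (assumed) microphone layer for every captured chunk."""
        with self._lock:
            if self._ready:
                self._send(chunk)
            else:
                self._pending.append(chunk)

    def on_backend_ready(self) -> None:
        """Called once the recognizer is set up; flushes saved audio in capture order."""
        with self._lock:
            while self._pending:
                self._send(self._pending.popleft())
            self._ready = True
```

The point is only that the button press never has to wait: anything spoken during setup is held in `_pending` and delivered in order as soon as `on_backend_ready` fires, and later chunks go straight through.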

This is just a market begging to be disrupted. I want to see a startup combine Whisper, GPT, and a competent TTS model into a killer voice UI!

1 comment

What flabbergasts me is how often the screen will display a successful speech-to-text capture, and then it poops the bed anyways. Like, you did it! You did the hard part! The part that feels like goddamned magic to me, converting the noisy messy reality of sound waves into text. And then it drops the ball on the simple pile of "if" statements it takes to convert that text into an action.