Hacker News
by jaypeg25 1151 days ago
Siri, Alexa, and Google Assistant are all generally good at those things but honestly not much else. I use my Google Home every day for timers and to turn on/off lights. I don't think any of them will ever be the helpful AI that they were initially promoted as, though.
2 comments

I'm shocked how much the quality of Google Assistant has degraded over time.

I have a Mini that consistently misunderstands broadcast requests and says "sorry I'm not playing anything right now". On the rare occasions when it does convert a broadcast to text, it consistently cuts off the first word or letter: say "I'll be right down" and listeners get "L L be right down".

It used to support simple offline requests like SMS and Navigate when data was unavailable. No more.

It used to integrate with Google Keep. No more.

It no longer recognizes the word "torch" as a synonym for flashlight. Why would I be asking to turn on my phone's "porch"?

Painfully slow replies, even under ideal network conditions... Just spinning forever, often until a timeout, and it doesn't even have the decency to report that with a proper error message.

It's just amazing how they launched a product with a clear "this is where we are, this is our vision of where we're going", and they still sell it, but they're going in the opposite direction.

My pet theory is that they were launched by the A team, who got replaced by the B team when the A team left for greener pastures. Or maybe they got it working, but in a bid to get another promotion they kept tweaking it, making it worse in ways that matter to us but not the promotion committee.

See, at least Siri usually is ready to take your input the moment you press the button, even if it then casually discards your input because it can’t reach Apple’s servers or whatever.

Google Maps: I swear, about half the time I try to activate voice search, it sits and spins before even accepting any voice input at all. Why can’t it just start reading the microphone right when I activate it, and then submit the saved audio whenever it’s done getting set up? It’s so abysmally poor that it’s usually faster to scroll through recent destinations or literally grab the phone, unlock, and put in a destination.
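
For what it's worth, the "start recording right away, submit once setup is done" idea can be sketched in a few lines. This is only a rough illustration, not how Google Maps or any real assistant works: `BufferedVoiceInput`, `recognizer_factory`, and `on_audio` are made-up names, and the "recognizer" is just any callable that takes the captured bytes.

```python
import queue
import threading


class BufferedVoiceInput:
    """Capture audio from the moment of activation; submit once setup finishes."""

    def __init__(self, recognizer_factory):
        self._chunks = queue.Queue()     # audio captured so far, in order
        self._ready = threading.Event()  # set once the recognizer exists
        self._recognizer = None
        # The slow setup (network, model load) runs in the background,
        # so the caller can start feeding audio immediately.
        threading.Thread(
            target=self._setup, args=(recognizer_factory,), daemon=True
        ).start()

    def _setup(self, recognizer_factory):
        self._recognizer = recognizer_factory()
        self._ready.set()

    def on_audio(self, chunk):
        """Hypothetical microphone callback: just store the chunk."""
        self._chunks.put(chunk)

    def finish(self, timeout=5.0):
        """Wait for setup, then submit everything captured so far."""
        if not self._ready.wait(timeout):
            # Fail loudly instead of spinning forever.
            raise TimeoutError("recognizer did not become ready")
        captured = []
        while not self._chunks.empty():
            captured.append(self._chunks.get())
        return self._recognizer(b"".join(captured))
```

The point is only that capture and setup don't have to be sequential: the user can talk while the slow part happens, and a setup failure turns into an explicit error rather than an endless spinner.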

This is just a market begging to be disrupted. I want to see a startup combine Whisper, GPT, and a competent TTS model into a killer voice UI!

What flabbergasts me is how often the screen will display a successful speech-to-text capture, and then it poops the bed anyways. Like, you did it! You did the hard part! The part that feels like goddamned magic to me, converting the noisy messy reality of sound-waves into text. And then it drops the ball on the simple pile of "if" statements it takes to convert that text into an action.
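
To make the "pile of if statements" concrete, here is roughly what I mean, as a toy sketch. The synonym table and the action names are invented for illustration; a real assistant's intent handling is surely more involved than this.

```python
# Hypothetical synonym table and action names, for illustration only.
SYNONYMS = {"torch": "flashlight"}


def normalize(text):
    """Lowercase, strip basic punctuation, and map synonyms to one canonical word."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    return [SYNONYMS.get(w, w) for w in words if w]


def to_action(text):
    """Map already-transcribed text to an action name with plain 'if' checks."""
    words = normalize(text)
    if "flashlight" in words:
        if "on" in words:
            return "FLASHLIGHT_ON"
        if "off" in words:
            return "FLASHLIGHT_OFF"
    if "timer" in words:
        return "SET_TIMER"
    # Anything unrecognized gets an explicit result the UI can report.
    return "UNKNOWN"
```

With this, "Turn on my torch." and "turn on the flashlight" both land on the same action, and anything unrecognized comes back as `UNKNOWN` so the caller can say so instead of silently failing.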
>I don't think any of them will ever be the helpful AI that they were initially promoted as, though.

I must disagree. ChatGPT-style LLM functionality with ElevenLabs-quality realtime voice synthesis will absolutely supercharge these products. The ability to e.g. answer kids' questions in simplified English according to parental prompt guidelines, or drill down on complex educational topics, or maintain context over many back-and-forth conversational interactions will be huge.

Strongly agree, tons of educational and entertainment value will be unlocked.
I'm really excited for LLM D&D dungeon masters.