Hacker News
by ghaff 1488 days ago
Voice control is mostly pretty disappointing. Setting an alarm, playing an album, getting a quick weather forecast? OK.

But something more complicated where you really would like good voice control--like when driving--not so much. For example, with podcasts, I find I really need to pre-populate a playlist, and by and large trying to control my phone entirely by voice is very hit and miss.

Voice assistants have gotten marginally better over the years. But I really wouldn't miss them much if they all went away tomorrow. The vision was/is that they could match at least a marginally competent personal assistant over the phone. And they're nowhere even near the ballpark.

5 comments

> playing an album

Only if the album you want is titled in English (or some recognised language and with actual words).

I listen to a lot of music that has unpronounceable song and album titles. Hell, even artists: how could I ever tell a voice assistant to play "STRGTHS by SHXCXCHCXSH"? An extreme example, but not too far from some of the top 10 recently played stuff on my Spotify: "sch.mefd 2" by Autechre, "JNSN CODE GL16 / spl47" an album/EP by the same Autechre, "Hygh 2k12" by SCNTST.

It's a technology in that uncanny valley: it works and simplifies some use cases, but is frustrating enough in some edge cases that you end up not trusting it, which in my case makes me avoid it.

Even for some basic alarms/timers it can be frustrating when it misinterprets your accent and sets timers for 50 minutes instead of 15. The pain of having to fix the failure and then re-add a timer/alarm is enough to push me away.

I use Alexa's announce capability to tell my wife to answer the phone--but what I actually tell her to say is in her native language, not in English. The announcement works fine but my phone tries to render it as English text. Never the same thing twice.

And she definitely does the 50/15 thing to me, although generally the other way around. Everything from 30/13 to 90/19 is vulnerable to being misunderstood. My wife has a lot more trouble with it--she learned her first word of English at 43 and so she still has a fair accent.

In many cases it doesn’t matter even if the word is in English. When choosing an audiobook, I’ve had to make some leaps in phonetics to get Alexa to understand certain words in the title.

> Hey Siri, start the last podcast I was listening to in Overcast.

> I can't let you do that, Dave. Overcast has done its best to set up a shortcut, but you need to say it exactly or else I'm punishing you with the enunciation of useless web searches.

> But something more complicated where you really would like good voice control--like driving--not so much.

I’ve tried Alexa, Google, and Siri multiple times over the years while driving, and it’s just embarrassing how overhyped all of them are. These are simple questions that a human could easily answer in seconds with a search (if not driving, of course), yet none of them even gets close.

- How far away is that storm?

- How many miles to the state line?

- What timezone is Omaha in?

- Where’s the closest gas station that has diesel?

- What’s the top rated BBQ place in town?

You can easily search how far away is that storm?

The others really ought to be voice searchable, but the diesel one would need to be timely in my area, where a couple of places have not had diesel available for several days.

>You can easily search how far away is that storm?

Pull up radar on Weather Underground (and/or look at the hourly forecast) and you should have a pretty good idea.

Whether or not that's the best example, the point is that there are a bunch of things I might want to know/do while driving that I can't look up without pulling off the road someplace. And even if I could theoretically look them up by voice, it would probably be an exercise in frustration to try to do so.

It's an interesting example because the question is very easy for a person to understand, while giving an actual answer would take some time; for an AI, the two are inverted.

But at least a human would give a reasonable answer, like "looks like at least thirty miles" or even just "I don't know". Your phone will instead say, "oops, I didn't quite get that, try again later" (goodbye chime). Which is terrible.

> You can easily search how far away is that storm?

Yes, I have two apps I use for that depending on what form of “how far away” I’m looking for, Dark Sky (time) and RadarScope (distance).

I tried using Google Assistant a few years ago and it was so frustrating in some dumb ways.

A notable example was when I tried using it to make a call. I told it to call my wife, by name, and it couldn't understand her name at all. So I said "call my wife" and it asked who my wife is. I couldn't answer with voice, because it still didn't understand her name. But it did give me a popup to select her from my address book. So I did and the popup went away... No call. So I tell it "call my wife" and it replies "who is your wife?".

I had the same problem, funnily enough! My wife has an Irish name that Google cannot pronounce (Alexa is much better in this one area), so I can't message her unless I try to match the mispronunciation (which I balk at out of self-respect). But it allows you to set nicknames for people, so I gave my wife the nickname 'my wife'. I had to use the UI to do that, and even then, do I really trust Google to message the right person?

Is she confused because you're a polygamist? :)

> Voice control is mostly pretty disappointing.

That's because you interact with it as though it is a person, so your expectations correspond to the mode of communication used.

>That's because you interact with it as though it is a person

Well, that was sort of the pitch and it's certainly implied by "virtual/voice assistant." Certainly Amazon wasn't pitching Alexa as a voice-operated kitchen timer. To be honest, I'm probably better with them now because I know they mostly don't work but can be used for some simple tasks for which I know an incantation that mostly gives me the result I want.

Fair enough, but anybody that knows more about this stuff than your average consumer was likely quite skeptical of that pitch. All I saw was an always-on microphone with a line to Apple, Google or Amazon, and for me that was reason enough to bar that stuff from crossing the threshold at the front door here.

>anybody that knows more about this stuff than your average consumer was likely quite skeptical of that pitch.

That probably was--or should have been--the case. But it's one of those things that seems like it would be pretty straightforward. After all, if a fairly young child can do something, it seems like a computer wired up to the Internet could. And in fact, voice recognition has gotten quite good--at least for English speakers without a strong accent. But actually carrying on a conversation in natural language is a really hard problem, even if children can do so from a fairly young age.

Yeah, but Otter is friggin awesome. Could it have more commands in it, like “email those items to me” or “make a to-do list”?