by matwood 2742 days ago
> Do people really use voice commands for, well, anything?

Siri plus shortcuts have made many mundane tasks easier. When I get in my car to come home from work I say "Hey Siri, heading home." That causes my phone to text my wife my arrival time and starts the last podcast I had playing.

It's a simple thing, but is so much easier than texting and then thumbing through the podcast player to start where I left off. I have others like logging my water intake or weight, but it was really adding shortcuts to Siri that made these possible.

Playing music or TV shows is also much easier/nicer. "Hey Google, play The Office on Netflix".

Timers. Another simple thing that is so much easier when you can use your voice when cooking.


I am skeptical when technologists say voice-assisted systems will become the dominant interface, at least in countries with a high rate of literacy.

I just look at TV versus radio, texting versus calling, or audiobooks versus written content. I believe most studies indicate that people are better at visual comprehension than auditory:

https://news.nationalgeographic.com/news/2014/03/140312-audi...

I know scientists love working on voice and speech recognition, since it is a hard problem to solve, but it sometimes feels like it's a bit of a solution in search of a problem. I'm sure there are good use cases, I'm just skeptical that they are profound enough for voice to be our primary medium for interaction.

More generally, I think the thing you are noticing is that visual and physical items offer random access.

Compare trying to find a specific piece of information in a book, vs in some training DVD.

If I'm just learning how to cook, watching a professional demonstrate the whole thing is going to be very helpful, but if I already know how to cook in general it's easier to flick to the right section of a book and scan the page for the bit of information I need.

Or compare the difference between listening to a phone system's 7 different options vs seeing all the options available on a single screen.

The other side of this is precision. Not only do input methods like a keyboard allow you to give extremely explicit, high-information instructions with no need for interpretation, they also have extremely fast feedback loops. Imagine trying to use your voice to click on a specific part of an image, or draw a circle around it. Far, far easier to move a pointer with your hand, watch where it goes, and then click when it's in the right position.

So visual comprehension probably is better than auditory, but I think the main things that are important are random access, specific and information-dense input, and low-latency feedback loops on input - all things that we are far better at achieving with physical/visual methods than with auditory or speech-based methods.

This is very well said and a great point. A lot of this comes down to random access and which interface gives you an O(1) lookup. “Play season 2, episode 3” could be better as voice, whereas “if you want to reach reception, dial 1” is much better as a visual interface.

I agree with your skepticism about voice becoming generally dominant, but it’s already very useful. It may also become the dominant form of usage for some systems.

It's also hard to imagine sound as a dominant interface because all we have are mediocre examples. We have to work within clunky command boundaries, rephrase commands, be in a quiet environment, not have an accent, etc.

I'm glad we're making progress, but I'll be a skeptic until I can give voice requests as naturally as I'd give them to a human. IMO there's no limit from there.

I agree with your main point, but your examples seem suspect to me. TV is video _and_ audio, audiobooks are a translation of an existing artform (that is, books were originally created to be read, not listened to), and I find texting to be extremely clunky as a concept and do not enjoy tapping out long or interesting messages on a tiny touchscreen.

> Siri plus shortcuts have made many mundane tasks easier. When I get in my car to come home from work I say "Hey Siri, heading home." That causes my phone to text my wife my arrival time and starts the last podcast I had playing.

Wait, you do what?

How can I get SIRI to do this for me? Can you explain how you got SIRI to do that?

Along with iOS 12, Apple released a new app called Shortcuts. It is not preinstalled as far as I know. I think they got it via acquisition. There’s a bunch of new hooks that app makers can/must use to integrate with it, so not everything is supported.

It’s basically IFTTT for iOS, and you can assign phrases in Siri to it.

https://support.apple.com/guide/shortcuts/welcome/ios

I never got Siri to work that well. I've found it has problems with calling and with some of its standard features. It's still on my iPhone, since there's no way to remove it, but I've found Google Assistant to work better. Plus, the Google Home integration is nice.

I use Siri for almost every task that doesn’t require me to physically look at my screen, e.g. reading, watching a video or typing. The #1 and #2 problems I have with Siri recognizing my input are, in order: the quality of the mic I’m using and the ambient noise level in whatever environment I’m in. The #3 problem is the quality of my network connection because it won’t work if it can’t contact Apple.

In my experience, AirPods are the best Siri input device I own. EarPods are a distant second, and the built-in mic on the phone is a not-very-distant third. Siri is effectively unusable on my laptop’s built-in mic.

The ambient noise basically means I can’t use Siri in noisy places, and I’m typically not inclined to. I might raise my voice a little if I’m putting on a podcast outside and it is windy.

#3 means disabling WiFi when I leave my house, until I’m in a location with a solid WiFi connection, in part because I make use of my cable WiFi. If I have no service and no WiFi then I have no Siri, not even to set a timer.

Beyond that, I find the basic feature set adequate, but not comprehensive. Siri shortcuts doubled Siri’s usefulness and I only use them for three or four apps.

That said, I appreciate Siri’s presence because it does enable me to leave my phone in my pocket a lot more than I used to, so there will occasionally be a week where I didn’t spend more than an hour or two looking at my phone’s screen the entire week (not per day, per week), with 90% of this time spent reading a book. Observing my friends’ obsessions and work habits, I appear to be the outlier in that regard.