by zdragnar 803 days ago
You're comparing a person recommending a single event versus an AI providing a list. In other words, proving OP's point.

GUIs provide information in 2D, letting eyes skim and bypass information that's not useful.

VUIs provide information in 1D, forcing you to take information in as a linear stream and providing, at best, weak controls for skipping around without losing context or your place.

Not coincidentally, this is why I absolutely hate watching videos for programming and news. If it's an article or post or thread, I can quickly evaluate if it has what I want and bypass the fluff. Videos simply suck at being usable, unless it's for something physical like how to work on a particular motor or carve a particular pattern into wood.

2 comments

The person you are replying to is arguing the opposite - a future VUI could know your interests and just read you out the one relevant event rather than reading a list.

Alternatively it could summarise into “there’s a standup comedy gig, a few bands and various classes - what sort of thing are you looking for?” and then discuss with you to find the 2-3 events that are most relevant rather than reel off one big list.

And if it were a visual display of those things, I would have homed in on what I wanted and gotten my answer in the time it took me to ask what I was looking for.

It may be fantastic as an aid for people with low or no sight, but so long as I can read, a VUI is strictly inferior.

Disagree - both can complement each other in different circumstances.

If I’m ironing and want to know when my first meeting is, chances are a VUI is better.

If I want to see my whole day's itinerary, a GUI is probably better.

> a future VUI could know your interests and just read you out the one relevant event rather than reading a list.

For this to work, it may require a level of accuracy and intelligence that is unobtainable.

I would still prefer it to send it to me on my mobile device display. Voice interaction is nice for accessibility, but the first method of control (whatever it is) is faster.

The point I'm trying to make is that this thing we're calling a 'VUI' is shit. There's no reason speech has to be this boring one-dimensional thing. It's like the people that designed these things have never had a real conversation in their lives. When you're speaking with another person, or multiple other people, you're constantly exchanging cues that allow the other person to understand and re-calibrate what they're saying. These are verbal sounds, non-verbal sounds and physical movements. A crinkle of the forehead, a shake of the head, an uttered 'aaaaah' or a quiet verbal affirmation in support of what's being stated. It's not a single uni-directional stream of information, it's a multi-directional stream coming from multiple multi-modal sources at the same time.

None of these basic realities are accounted for in current technology. Instead we have these dumb robot voices reading us results from a preprocessed script that the system thinks answers our question. No wonder the monkey part of our brain immediately picks up on the fact that this whole facade isn't just a lie, but an excruciating lie. It's excruciating because it's immediately obvious that there's nothing else 'there' to interact with. Even when speaking to another person over the phone, there's a huge amount of nuance you can pick up on. Are they happy? Are they sad? Are they frazzled? Are they in a rush? Are they relaxed? And you automatically calibrate your responses and what you say in the conversation based on all of these perfectly obvious things. Normal humans automatically calibrate what they say, how they sound, what they suggest based on these cues. It works really well!

There's no reason voice stuff has to suck. It has worked pretty great for humans for thousands of years. We're evolutionarily tuned to it. It's just that all the technology we've created around it totally sucks and people are delusional if they think it's anywhere near prime time.