Hacker News
by thisisananth 2459 days ago
In a website or an app, there are specific affordances (e.g. buttons, dropdowns, GPS, and text boxes) that bound the input and steer the user toward completing the task.

For speakers like Alexa and Google Home, voice is the only input, which lets users say whatever they want and makes the task space effectively infinite. But voice recognition and NLP are not yet at a point where they can recognize everything the user says. This creates a less than stellar experience with the user having to repeat, rephrase or even worse abandon the task. I think this platform will blow up when NLP/AI is able to detect user intent with near-perfect accuracy and make the interaction as fluid as with a well-designed app. It doesn't hurt for Amazon to have a large installed base ready to use the platform if/when intent recognition gets up to par.

Of course, it will never replace the phone or desktop, as there will be things we cannot say out loud (secrets), places where voice is not practical (loud places), and situations where it is simply not courteous.

1 comments

> This creates a less than stellar experience with the user having to repeat, rephrase or even worse abandon the task.

Not to mention: constant wondering whether the task can even be accomplished. When a voice assistant rejects your query, in many cases you can't be sure whether it's because it couldn't understand you, or because it can't possibly accept what you said as a valid input in the context it's in. In regular interfaces, visible constraints matter as much as affordances.

Norman would refer to these constraints as "signifiers", indicators of possible affordances. It's interesting how weak voice assistants are at signifying what you can actually do with them.
Thanks for introducing me to the term. Damn, I need to finally read that book.
The dev team can add helpful responses that signify the available voice commands. The skill can suggest tasks it is able to complete based on keywords it recognizes in the user's utterance, or it can simply tell users that it didn't understand them and that they can ask for help to hear a list of actions. (I've worked on published Alexa skills for several large tech companies.)
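A minimal sketch of that kind of fallback logic, in plain Python rather than any real Alexa SDK. The intent names, keywords, contexts, and the `respond` function are all made up for illustration. The point is only that "I didn't understand you" and "I understood, but can't do that right now" get different replies, and both end by signifying what is actually possible.

```python
import re

# Hypothetical intents: name -> (spoken description, trigger keywords).
INTENTS = {
    "play_music": ("play music", {"play", "music", "song"}),
    "set_timer": ("set a timer", {"timer", "countdown"}),
    "weather": ("check the weather", {"weather", "forecast", "rain"}),
}

# Hypothetical contexts: which intents are valid in each state.
CONTEXT_ALLOWED = {
    "idle": {"play_music", "set_timer", "weather"},
    "timer_running": {"set_timer", "weather"},
}


def list_actions(context: str) -> str:
    """Spell out the commands available in this context (the signifier)."""
    phrases = sorted(INTENTS[name][0] for name in CONTEXT_ALLOWED[context])
    return "You can ask me to: " + ", ".join(phrases) + "."


def respond(utterance: str, context: str = "idle") -> str:
    tokens = set(re.findall(r"[a-z']+", utterance.lower()))

    # Explicit help request: read out the available actions.
    if "help" in tokens:
        return list_actions(context)

    # Keyword spotting: any intent sharing a word with the utterance.
    matched = [name for name, (_, kws) in INTENTS.items() if tokens & kws]

    # Case 1: nothing recognized at all.
    if not matched:
        return "Sorry, I didn't understand that. Say 'help' to hear what I can do."

    # Case 2: recognized, but not valid in the current context.
    valid = [name for name in matched if name in CONTEXT_ALLOWED[context]]
    if not valid:
        wanted = INTENTS[matched[0]][0]
        return (f"I understood that you want to {wanted}, "
                f"but I can't do that right now. {list_actions(context)}")

    # Case 3: recognized and allowed.
    return f"OK, I'll {INTENTS[valid[0]][0]}."
```

With this, `respond("blargh")` yields the "didn't understand" reply, while `respond("play a song", "timer_running")` says the request was understood but is unavailable and then lists the alternatives. Real keyword matching would need to be far more robust than a set intersection.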

I think a cool, immersive middle ground will be smart surfaces embedded in wall materials that can display things. They could simply list all available actions, or anthropomorphize the smart assistant as a kind of virtual servant that follows you around, serving up facts and doing monotonous IoT actions for you.

Now the privacy and surveillance implications of something like this are another story...

> Now the privacy and surveillance implications of something like this are another story...

Those would be resolved here and elsewhere if the industry could be made to stop trying to own people's data. It's not the data that should be a commodity; it's software and clouds.