by andreabergonzi 90 days ago
Really appreciate this! This is one of the strongest framings I’ve seen of where this could go.

The semantic action point is exactly where I think the architecture wants to evolve: less “infer everything from the DOM,” more explicit app-level capabilities like opening, filtering, confirming, assigning, etc.
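To make the semantic action idea concrete, here is a minimal sketch of what an explicit capability registry could look like. Everything in it is an assumption for illustration: `ActionRegistry`, `SemanticAction`, and the two example handlers are hypothetical names, not part of the project discussed here.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class SemanticAction:
    """One app-level capability the app chooses to expose (hypothetical type)."""
    name: str
    description: str
    handler: Callable[..., Any]


class ActionRegistry:
    """Hypothetical registry: the app declares actions instead of having them inferred from the DOM."""

    def __init__(self) -> None:
        self._actions: Dict[str, SemanticAction] = {}

    def register(self, name: str, description: str) -> Callable:
        # Decorator that stores the handler under an explicit action name.
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._actions[name] = SemanticAction(name, description, fn)
            return fn
        return decorator

    def capabilities(self) -> List[str]:
        # The list a voice or gesture layer could match user intent against.
        return sorted(self._actions)

    def invoke(self, name: str, **kwargs: Any) -> Any:
        if name not in self._actions:
            raise KeyError(f"unknown action: {name}")
        return self._actions[name].handler(**kwargs)


registry = ActionRegistry()


@registry.register("filter", "Filter the visible list by a field value")
def filter_items(items: List[dict], field: str, value: Any) -> List[dict]:
    return [item for item in items if item.get(field) == value]


@registry.register("assign", "Assign an item to a person")
def assign_item(item: dict, assignee: str) -> dict:
    # Return a copy so the caller's item is left untouched.
    return {**item, "assignee": assignee}
```

The point of the sketch is only that the input layer resolves intent against a small declared list (`registry.capabilities()`) rather than guessing at clickable elements.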

And I think you’re right on the temporary / situational accessibility angle too. I didn’t start from accessibility, but the more I build this, the more it feels like a natural fit for those cases because it removes the need to install a separate assistive stack.

Head nod as a simple yes/no is also a very interesting idea. I probably wouldn’t start there before hardening the core loop, but it feels like a strong extension once the underlying interaction model is solid.


Really excited to see where this goes.

Oh, one other idea that popped into my mind is using facial and vocal emotion data to help drive supportive interactions. One thing that gets lost in a lot of these tools is guiding folks when the action taken isn't the one they expected. I think back to when I was trying to get Google Assistant to play a particular song and it kept getting it wrong (I actually had the title wrong, but I didn't know that then). I asked it 4-5 times, with my tone getting more and more frustrated, and it just kept playing the same song. If it had known I was getting frustrated, it could have gone "Sounds like I'm not getting the right song, can you hum the tune or say some of the lyrics?"

That’s a really good point. I think the deeper problem there is not just understanding intent, but knowing when to stop confidently executing and switch into a better recovery mode. Thanks for the very useful feedback! I'll get back to working
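
As a rough illustration of "stop confidently executing and switch into recovery," here is a minimal sketch that falls back to a clarifying prompt once the same request has been repeated too many times in a row. The class name `RecoveryPolicy`, the `max_repeats` threshold, and the `"execute"` / `"clarify"` labels are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RecoveryPolicy:
    """Hypothetical policy: repeated identical requests suggest the last result was wrong."""
    max_repeats: int = 2  # assumed threshold, would need tuning in practice
    history: List[str] = field(default_factory=list)

    def next_step(self, request: str) -> str:
        # Normalize case and whitespace so near-identical repeats compare equal.
        normalized = " ".join(request.lower().split())
        self.history.append(normalized)

        # Count how many times in a row the same request has just been made.
        repeats = 0
        for past in reversed(self.history):
            if past != normalized:
                break
            repeats += 1

        # Past the threshold, stop re-executing and ask a clarifying question instead.
        return "clarify" if repeats > self.max_repeats else "execute"


policy = RecoveryPolicy()
steps = [policy.next_step("Play that song") for _ in range(4)]
print(steps)
```

A frustration signal from tone of voice, as suggested above, could be one more input that lowers the threshold, but repeat counting alone already catches the "asked 4-5 times" case.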