by artpar 247 days ago
Everyone thinks of typical e-commerce pages when it comes to "a browser agent doing something", but our real use cases are far from shopping for the user. Your point still stands, though. There may be websites where generating stable selectors/hierarchy maps won't solve the problem, but 80% of websites (in the 80-20 sense) are not like that, including a lot of internal dashboards/interfaces. There will also be issues on websites with proper i18n implementations if the selectors are aria-label based.

Self-healing CSS selectors are also only one part of the story. The other part is a cohesive interface for the agent itself to use these selectors.
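
For illustration, here is a minimal sketch of what "self-healing" could look like, not how any particular agent actually does it. The names `HealingSelector` and `resolve` are hypothetical, and the selector strings are made up. The idea is to keep an ordered list of candidate selectors per target, try them in order, and promote whichever one still matches. The lookup is passed in as a function so the sketch does not depend on a real DOM.

```typescript
// Hypothetical types for this sketch; not taken from any real library.
type Query<T> = (selector: string) => T | null;

interface HealingSelector {
  name: string;
  // Candidate selectors, most preferred first.
  candidates: string[];
}

// Try each candidate in order. When one matches, move it to the front
// so the next lookup tries the selector that currently works first.
function resolve<T>(sel: HealingSelector, query: Query<T>): T | null {
  for (const candidate of sel.candidates) {
    const hit = query(candidate);
    if (hit !== null) {
      sel.candidates = [
        candidate,
        ...sel.candidates.filter((c) => c !== candidate),
      ];
      return hit;
    }
  }
  // No candidate matched; the caller has to regenerate selectors.
  return null;
}

// Example target: the aria-label candidate is listed first, with a
// (made-up) data-testid selector as a fallback.
const submitButton: HealingSelector = {
  name: "submit",
  candidates: ["[aria-label='Submit']", "[data-testid='submit']"],
};
```

In a browser this would be called as `resolve(submitButton, (s) => document.querySelector(s))`. On a localized page the aria-label candidate may stop matching, in which case the fallback gets promoted, which is the i18n case mentioned above.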

1 comment

> The other part is the cohesive interface for the agent itself to use these selectors

We are incubating this over at the WebMCP web standard proposal. You can see the current draft of the explainer for the declarative API here: https://github.com/webmachinelearning/webmcp/pull/26

Also, great work on the browser agent. This is the best of the DOM parsing/screenshot agents I've used. I was really impressed with the Wordle example.