| HN Mirror

> We're exploring some graph re-ranking techniques to eliminate useless elements from the HTML DOM when analyzing the page.

Computer vision is useful and very quick, however, it has been my experience parsing stacking context is much more useful. The problem is creating a stacking context when a news site embeds a youtube or blusky post. It requires injecting script into each using playwright. (Not mine, but, prior art [0]).

I've been quietly solving a problem I encountered creating browser agents that didn't have a solution 2 years ago in my free time. Most webpages are several independent global execution contexts and I'm developing a coherent way to get them all to speak with each other. [1]

> "Go to Amazon.com and add an iPhone 16, a screen protector, and a case to cart"

Are you familiar with Google Dialogflow? [2] It is a service which returns an object with intent and parameters which make it is to map to automation actions. I asked GhatGPT to help with an example of how Dialogflow might handle your request. [3]

[0] https://github.com/andreadev-it/stacking-contexts-inspector

[1] https://news.ycombinator.com/item?id=42576240

[2] https://cloud.google.com/dialogflow/es/docs/intents-overview

[3] https://chatgpt.com/share/678ae18d-5370-8004-97d4-f9949887b0...