by alexirobbins 974 days ago
Selectors have been our primary focus so far – they're notoriously finicky! Our roadmap includes more extensive use of AI, both as embedded intelligence and in the code generation process. For example, one thing we've heard from heavy users of browser automation is that maintenance becomes the largest cost. Self-healing automations will be able to either fix themselves or give you an alert with a suggested fix to work from.
3 comments

The "self-healing" part sounds very interesting. I've been trying to think through how to approach this myself, in a Chrome extension that runs DOM selectors in automations. Curious if you have any high-level thoughts/findings in this area?
We're just getting started on it ourselves, but it's a really fun problem. The most useful finding so far is that simplifying the DOM representation really helps the model reason about state.
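To make "simplifying the DOM representation" a bit more concrete, here is a rough sketch of one way it could be done, using only Python's standard `html.parser`. This is not the approach described above, just an illustration: the attribute allowlist (`KEEP_ATTRS`), the skipped tags, and the 80-character text cap are all assumptions picked for the example.

```python
from html.parser import HTMLParser

# Assumptions for illustration: which attributes are stable enough to keep,
# and which subtrees carry no useful signal for choosing a selector.
KEEP_ATTRS = {"id", "name", "role", "type", "aria-label", "data-testid", "href"}
SKIP_TAGS = {"script", "style", "svg", "noscript"}
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class DomSimplifier(HTMLParser):
    """Collects an indented outline of the page with noisy parts removed."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []
        self.depth = 0
        self.skip_depth = 0  # > 0 while inside a skipped subtree

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        # Drop class/style and other churn-prone attributes.
        kept = " ".join(f'{k}="{v}"' for k, v in attrs if k in KEEP_ATTRS and v)
        self.lines.append("  " * self.depth + f"<{tag}{' ' + kept if kept else ''}>")
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag in VOID_TAGS:
            return
        self.depth = max(0, self.depth - 1)

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and not self.skip_depth:
            self.lines.append("  " * self.depth + text[:80])


def simplify_dom(html: str) -> str:
    parser = DomSimplifier()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)
```

Feeding it a button wrapped in a `div` with generated class names yields a short outline that keeps only `id`, `data-testid`, and the visible text, which is far less for a model to read than the raw markup.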
I'm confused: the demo shows typing something in to select an element in a row, which looks to be AI, but I don't see anything that looks like AI in the selectors themselves. I'm not even sure how you would work with selectors, unless you put the whole HTML into the context window or just ask which locator looks most reliable?
That’s exactly what we do - we sample relevant parts of the DOM and use the model to write the logic for selecting that element. This works pretty well and saves a lot of the time that developers otherwise spend inspecting the HTML structure to write the selectors themselves.

Going forward, we’re excited to experiment with more intelligence at runtime, e.g. using AI to try to recover if the selector isn’t found.
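As a rough illustration of what runtime recovery could look like, here is a minimal sketch of a try-then-repair loop. Everything in it is hypothetical: `find` stands in for whatever lookup the automation uses (for example, a wrapper around Selenium's `driver.find_element` that returns `None` on `NoSuchElementException`), and `suggest_selector` stands in for a model call that proposes a replacement for a failed selector.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def find_with_recovery(
    selector: str,
    find: Callable[[str], Optional[T]],
    suggest_selector: Callable[[str], Optional[str]],
    max_repairs: int = 2,
) -> Tuple[T, str]:
    """Try the scripted selector first; ask for a repair only if it fails.

    Returns the element together with the selector that matched, so the
    caller can raise an alert or update the script when a repair was used.
    """
    current = selector
    for attempt in range(max_repairs + 1):
        element = find(current)
        if element is not None:
            return element, current
        if attempt == max_repairs:
            break
        # Hypothetical model call: propose a new selector for the failed one.
        proposal = suggest_selector(current)
        if not proposal or proposal == current:
            break  # nothing new to try
        current = proposal
    raise LookupError(f"no element found for {selector!r}")
```

Because the scripted selector is always tried first, the fast path costs nothing extra; the repair step only runs on failure, and returning the selector that worked makes it easy to surface a suggested fix instead of changing the script silently.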

So I assume the video is the ground truth, then the AI has access to the DOM and the video, and generates a selector based on the video during each test run in order to avoid flakiness due to DOM/class/attribute changes?
Right now the generated script is the ground truth, but we’ve been working on augmenting this with images & videos to fall back on. We think defaulting to code is good because it is faster, cheaper, and easier to reason about in the 95%+ of cases where it works. Plain old Selenium will get you pretty far, especially if creating scripts is much easier.