by GPUboy 654 days ago
Thanks for the feedback.

This is definitely an area we can improve, but we do have a novel framework for testing and maintaining robustness. We use an LLM-based testing loop to verify each step in the state machine. The agent itself is built through a chat interface that generates it end to end and then outputs a no-code graph. This testing loop will also soon let us support uploading Loom videos to generate automations without installing anything locally.
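For illustration only, a per-step verification loop over a state machine could be sketched roughly as below. This is not the product's actual code: `Step`, `run_with_verification`, and `stub_verify` are hypothetical names, and the verifier is a plain function standing in for the LLM call.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[dict], dict]  # takes the current state, returns a new state
    expectation: str                # what the verifier should check after the step


def run_with_verification(steps, verify, state=None, max_retries=2):
    """Run each step, ask the verifier to confirm it, and retry on failure."""
    state = dict(state or {})
    report = []
    for step in steps:
        passed, attempts = False, 0
        while not passed and attempts <= max_retries:
            attempts += 1
            candidate = step.action(dict(state))
            passed = verify(step.expectation, candidate)
        report.append({"step": step.name, "passed": passed, "attempts": attempts})
        if not passed:
            break  # stop the run at the first step that cannot be verified
        state = candidate
    return state, report


def stub_verify(expectation, state):
    # Assumption: a real verifier would send the expectation and the observed
    # state to an LLM. Here the expectation is just a key that must be truthy.
    return bool(state.get(expectation))


steps = [
    Step("open_page", lambda s: {**s, "page_open": True}, "page_open"),
    Step("click_post", lambda s: {**s, "posted": True}, "posted"),
]
final_state, report = run_with_verification(steps, stub_verify)
report_json = json.dumps(report)  # per-step results in a JSON-friendly shape
```

Keeping the verifier as a plain callable means the loop can be tested with a stub before swapping in a model-backed check.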

All agents have an API that directly returns results as JSON, including the results of this testing loop. Check this image for an example: https://cdn.discordapp.com/attachments/1068385542875664424/1...

We also introduce a new, robust, future-proof targeting strategy that still works when services change their designs, because semantic targets like "Post button" will still resolve if the button changes color or moves across the screen. This is a test that all current RPA tools fail, including UiPath, so there are multiple ways to improve on the incumbents given the tools now available with reasoning models.
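As a toy illustration of why a semantic target can survive a redesign while a positional one does not, consider the sketch below. It is a simplification under stated assumptions: `Element`, `find_semantic`, and `find_by_position` are hypothetical, and simple word overlap stands in for whatever model-based matching a real system would use.

```python
from dataclasses import dataclass


@dataclass
class Element:
    role: str   # e.g. "button"
    label: str  # visible text or accessible name
    color: str
    x: int
    y: int


def find_semantic(elements, description):
    """Pick the element whose role and label share the most words with the description."""
    words = set(description.lower().split())

    def score(el):
        return len(words & set(f"{el.label} {el.role}".lower().split()))

    best = max(elements, key=score, default=None)
    return best if best is not None and score(best) > 0 else None


def find_by_position(elements, x, y):
    """Brittle baseline: pick whatever element sits at fixed coordinates."""
    return next((el for el in elements if (el.x, el.y) == (x, y)), None)


# The same page before and after a redesign: "Post" changes color and moves.
old_ui = [Element("button", "Post", "blue", 100, 200),
          Element("button", "Cancel", "gray", 200, 200)]
new_ui = [Element("button", "Cancel", "gray", 100, 200),
          Element("button", "Post", "green", 640, 48)]

semantic_hit = find_semantic(new_ui, "Post button")
positional_hit = find_by_position(new_ui, 100, 200)
```

After the redesign, the semantic lookup still lands on the "Post" element, while the coordinate lookup now lands on "Cancel".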