by agrnet 256 days ago
At least in my industry (highly regulated), I think it would be better if these agentic e2e tools output Playwright code instead of keeping it all under the hood. No risk-averse regulated company will use a QA agent that could behave nondeterministically when re-running the same test.

As I mentioned above, a Playwright script alone won’t make the cut for many of the serious test cases we’ve seen; you need a whole system that ensures your tests are run and improved immediately. We built this project to support on-premise deployments, but you’ll need to run the whole engine and eventually use some SLMs/LLMs at different stages.
At the end of the day, is the LLM not just calling Playwright APIs? I’d rather have access to the final set of Playwright API steps that the LLM executed to accomplish a goal, rather than just hoping the LLM will choose the same actions again the second time I run it.
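To make the request above concrete, here is a minimal sketch of what "exporting the executed steps" could look like. It assumes the agent keeps a log of the browser actions it performed; the `Step` record and `to_playwright_script` function are hypothetical names invented for illustration, not part of any tool discussed here. The generated text uses Playwright's Python sync API (`page.goto`, `page.click`, `page.fill`), and any step without a plain Playwright equivalent is emitted as a comment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """One recorded agent action (hypothetical log format)."""
    action: str                  # "goto", "click", "fill", or something custom
    target: str                  # URL for "goto", selector otherwise
    value: Optional[str] = None  # text to type, only used by "fill"


def to_playwright_script(steps: List[Step]) -> str:
    """Render recorded steps as a standalone Playwright (Python) script."""
    lines = [
        "from playwright.sync_api import sync_playwright",
        "",
        "with sync_playwright() as p:",
        "    browser = p.chromium.launch()",
        "    page = browser.new_page()",
    ]
    for s in steps:
        if s.action == "goto":
            lines.append(f"    page.goto({s.target!r})")
        elif s.action == "click":
            lines.append(f"    page.click({s.target!r})")
        elif s.action == "fill":
            lines.append(f"    page.fill({s.target!r}, {s.value!r})")
        else:
            # Steps with no direct Playwright call are kept visible as comments.
            lines.append(f"    # not expressible in plain Playwright: {s.action}")
    lines.append("    browser.close()")
    return "\n".join(lines)


# Example log, as an agent might have recorded it (made-up values).
recorded = [
    Step("goto", "https://example.com/login"),
    Step("fill", "#email", "qa@example.com"),
    Step("click", "button[type=submit]"),
    Step("visual_check", "dashboard looks right"),
]
script = to_playwright_script(recorded)
print(script)
```

A script exported this way can be reviewed, versioned, and re-run without involving the model, which is the property a regulated team would presumably care about.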
We use Playwright (PW) for the interaction with the browser, but we represent what to do in a custom format (which could be executed in other frameworks too). So the PW code we could generate would be a subset, since the more interesting parts (custom functions) are not really implemented in PW.

Also, part of our format is specifically about finding a deterministic way of running steps, with automatic healing when a step fails. And we built the whole system to be self-hostable, so in the cases you mention you would have control over what is run and where.
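For readers wondering how "deterministic steps with automatic healing" can fit together, here is a rough sketch of one common pattern, not the project's actual implementation. The stored selector is always tried first, so a passing run never involves a model; only on failure is a healer consulted, and its result is saved for later runs. `FakePage`, `run_click`, and the `heal` callback are hypothetical stand-ins so the example runs without a browser.

```python
from typing import Callable, Dict, List, Set


class ElementNotFound(Exception):
    """Raised by the fake page when a selector matches nothing."""


class FakePage:
    """Stand-in for a browser page so the sketch runs without a browser."""

    def __init__(self, selectors: Set[str]) -> None:
        self.selectors = selectors
        self.clicked: List[str] = []

    def click(self, selector: str) -> None:
        if selector not in self.selectors:
            raise ElementNotFound(selector)
        self.clicked.append(selector)


def run_click(page: FakePage, step_id: str, cache: Dict[str, str],
              heal: Callable[[str], str]) -> bool:
    """Click via the stored selector; heal only on failure. Returns True if healed."""
    try:
        page.click(cache[step_id])   # deterministic path: no model involved
        return False
    except ElementNotFound:
        new_selector = heal(step_id)  # assumption: an LLM/SLM would sit behind this
        page.click(new_selector)
        cache[step_id] = new_selector  # persist the fix so the next run is deterministic
        return True


# The page was redesigned: "#submit" no longer exists, "#submit-v2" does.
page = FakePage({"#submit-v2"})
cache = {"submit": "#submit"}

healed = run_click(page, "submit", cache, heal=lambda step_id: "#submit-v2")
healed_again = run_click(page, "submit", cache, heal=lambda step_id: "#never-used")
```

In this sketch the first call heals and updates the cache, and the second call succeeds on the stored selector without calling the healer. In a regulated setting one would presumably also want each healed selector logged and reviewed before it is trusted.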