Show HN: NUA an agent that tests for product correctness (trynua.dev)
8 points by Paster335 20 days ago
We’ve been using background Claude loops a lot recently, and we would wake up to PRs that didn’t solve the problem we wanted and were built on wrong assumptions. On top of that, the tests the agents wrote were usually tautological and didn’t test for intent. We wanted an agent that takes all the context a company has and writes tests that check for product correctness as well.

For example, we work in regtech, so bugs aren’t always technical. We often see things like insider trading alerts that should have fired but didn’t. We wanted an agent that turns laws and regulations into tests.
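
To make the "regulation into test" idea concrete, here is a minimal sketch in Python. It is not NUA's actual output: the blackout-window rule is a simplified, made-up policy, and `Trade`, `BlackoutWindow`, and `should_alert` are hypothetical names. The point is that the expected result comes from the written rule, not from reading the implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Trade:
    trader_id: str
    ticker: str
    trade_date: date


@dataclass(frozen=True)
class BlackoutWindow:
    ticker: str
    start: date  # inclusive
    end: date    # inclusive


def should_alert(trade: Trade, windows: List[BlackoutWindow]) -> bool:
    """Hypothetical rule: alert when a trade falls inside a blackout window for its ticker."""
    return any(
        w.ticker == trade.ticker and w.start <= trade.trade_date <= w.end
        for w in windows
    )


# Tests derived from the (assumed) policy text:
# "No trading in a restricted security during its blackout window."
WINDOWS = [BlackoutWindow("ACME", date(2024, 1, 1), date(2024, 1, 31))]


def test_trade_inside_window_fires_alert():
    assert should_alert(Trade("t1", "ACME", date(2024, 1, 15)), WINDOWS)


def test_trade_on_last_day_of_window_fires_alert():
    # Boundary case: the kind of alert that "should have fired but didn't".
    assert should_alert(Trade("t1", "ACME", date(2024, 1, 31)), WINDOWS)


def test_trade_after_window_does_not_fire():
    assert not should_alert(Trade("t1", "ACME", date(2024, 2, 1)), WINDOWS)
```

A tautological test would instead recompute the same expression the implementation uses and compare the two, so it passes even when the rule itself is encoded wrong.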

For now, users can upload PDF, MD, TXT, and DOCX files, but we’re planning integrations like Slack, Notion, Linear, and Zoom in the future.

We’re early on, so we would love to know what you all think!

2 comments

Looks cool. I've faced the tautological tests issue as well, so curious to see how you guys are solving that.

Where do the tests run? Is it testing the prod app?

Thanks for the feedback! When adding an app, you can point it to a staging URL. Tests will run on your staging environment.

On the tautological tests piece: we generate tests from your documentation, not your code. This way, our tests are formatted as paths instead of a set of assertions. Furthermore, our tests are in natural language, and we have an agent that goes through the click paths and validates that each step behaves as intended. This beats writing Playwright scripts, which often break when a major UI update happens.
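
As a rough illustration of "paths instead of assertions", a test could be modeled as an ordered list of natural-language steps that a checker walks through one at a time. This is only a sketch under assumptions, not NUA's real format: `PathTest`, `run_path`, and the stub checker are hypothetical, and in practice the checker would be an agent driving the staging app.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class PathTest:
    name: str
    source: str       # document the path was derived from (hypothetical field)
    steps: List[str]  # natural-language steps, in order


def run_path(
    test: PathTest, check_step: Callable[[str], bool]
) -> Tuple[bool, Optional[int]]:
    """Check steps in order and stop at the first failure.

    Returns (passed, index_of_failed_step); the index is None when every step passes.
    """
    for index, step in enumerate(test.steps):
        if not check_step(step):
            return False, index
    return True, None


# Example path with made-up steps; no selectors or DOM details are involved.
alert_path = PathTest(
    name="Blackout trade raises an alert",
    source="trading-policy.pdf",
    steps=[
        "Log in as a compliance analyst",
        "Submit a trade in a restricted security during its blackout window",
        "Open the alerts page and confirm a new insider trading alert is listed",
    ],
)

# Stub checker standing in for the agent: it approves every step.
passed, failed_at = run_path(alert_path, lambda step: True)
```

Because the steps describe intent rather than selectors, a UI redesign changes how the checker carries out a step, not the test itself.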

This is sick, it's like CI/CD for financial compliance
Yeah! One of our plans for the future is to integrate with GitHub Actions so that our tests run on every PR.