Show HN: SemanticTest – Test AI agents with semantic validation (open source) (semantictest.dev)
1 points by alessandro-a 250 days ago
Hey everyone!

I've been building AI agents lately and kept running into the same problem: how do you test them?

Manually prompting the agent before each release is tedious and doesn't scale, and existing agent-testing tools are often complex to integrate.

To help with this, I built a simple open-source testing framework that uses AI to validate AI: you define the expected behavior and let an LLM judge whether the output is semantically correct.

The LLMJudge returns a score from 0 to 1, along with its reasoning for why the output passed or failed.
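
To make the idea concrete, here is a rough Python sketch of the general LLM-as-judge pattern. It is not SemanticTest's actual API: the names `judge`, `Verdict`, `build_prompt`, and `fake_llm`, the JSON reply shape, and the 0.7 pass threshold are all assumptions made up for illustration, and the model call is stubbed out so the snippet runs offline.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    score: float      # 0.0 (wrong) to 1.0 (fully matches the expectation)
    reasoning: str    # the judge's explanation
    passed: bool      # score compared against the threshold


def build_prompt(expected: str, actual: str) -> str:
    """Ask the judge model for a JSON verdict (hypothetical prompt wording)."""
    return (
        "You are grading an AI agent's output.\n"
        f"Expected behavior: {expected}\n"
        f"Actual output: {actual}\n"
        'Reply with JSON only: {"score": <0 to 1>, "reasoning": "<why>"}'
    )


def judge(
    expected: str,
    actual: str,
    complete: Callable[[str], str],  # any function that sends a prompt to an LLM
    threshold: float = 0.7,          # assumed cutoff, not taken from the project
) -> Verdict:
    """Score `actual` against `expected` using the supplied model call."""
    data = json.loads(complete(build_prompt(expected, actual)))
    # Clamp in case the model returns a value outside the 0-1 range.
    score = min(1.0, max(0.0, float(data["score"])))
    return Verdict(score, str(data.get("reasoning", "")), score >= threshold)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call so the example runs without a network."""
    return json.dumps({"score": 0.9, "reasoning": "Greets the user by name."})


verdict = judge("Greets the user by name", "Hi Ada, welcome back!", fake_llm)
print(verdict.passed, verdict.score, verdict.reasoning)
```

In real use you would swap `fake_llm` for a function that calls whichever model you want as the judge. The rest stays the same.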

You can try it live here (no signups): https://semantictest.dev

The playground runs real LLMJudge validation so you can see how the semantic testing works.

The code is completely open source and you can find extensive documentation here: https://docs.semantictest.dev

I'd love to hear your feedback!

Thank you!