Hacker News
Show HN: Open-source testing framework for AI agents with semantic validation (github.com)
4 points by alessandro-a 252 days ago
Hey HN!

I built SemanticTest while working on calendar0.app (an AI calendar assistant).

While building the assistant, I noticed a lack of good AI eval frameworks that would help me test my agent.

SemanticTest uses GPT-4 as a judge to evaluate:

- Text responses (semantic meaning)

- Tool calls (correct tools, right order)

- Multi-turn conversations
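
To make the "correct tools, right order" check concrete, here is a minimal Python sketch. It is not SemanticTest's implementation: the post says the library uses GPT-4 as the judge, whereas this swaps in a plain deterministic subsequence check, and the function name and tool names are hypothetical.

```python
from typing import Sequence


def tools_called_in_order(actual: Sequence[str], expected: Sequence[str]) -> bool:
    """Return True if every tool in `expected` appears in `actual`, in that order.

    Extra tool calls in between are allowed (ordered-subsequence match).
    Hypothetical helper for illustration only.
    """
    remaining = iter(actual)
    # `name in remaining` consumes the iterator up to the match, so each
    # expected tool has to be found after the previous one.
    return all(name in remaining for name in expected)


# Hypothetical tool names for a calendar assistant.
calls = ["search_calendar", "check_conflicts", "create_event"]
in_order = tools_called_in_order(calls, ["search_calendar", "create_event"])
out_of_order = tools_called_in_order(calls, ["create_event", "search_calendar"])
```

A check like this only covers tool names and ordering; judging whether the arguments or the final text are semantically right is where an LLM judge would come in.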

It's composable: you build tests as JSON pipelines using custom blocks.
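
As a rough illustration of what "tests as JSON pipelines using custom blocks" could look like, here is a small Python sketch. Everything in it is an assumption made for the example: the JSON layout, the block names `mock_agent` and `stub_judge`, and `run_pipeline` are hypothetical and not SemanticTest's actual format or API (see the project docs for that). The judge is a keyword-matching stub standing in for a real LLM call.

```python
import json
from typing import Any, Callable, Dict

Context = Dict[str, Any]
Block = Callable[[Context, Dict[str, Any]], Context]


def mock_agent(ctx: Context, params: Dict[str, Any]) -> Context:
    # Stand-in for calling a real agent: echoes a canned reply from the spec.
    return {**ctx, "response": params["reply"]}


def stub_judge(ctx: Context, params: Dict[str, Any]) -> Context:
    # Placeholder for an LLM judge: a case-insensitive keyword check.
    text = ctx["response"].lower()
    ok = all(word.lower() in text for word in params["must_mention"])
    return {**ctx, "passed": ctx.get("passed", True) and ok}


# Custom blocks are registered by name so the JSON spec can refer to them.
REGISTRY: Dict[str, Block] = {"mock_agent": mock_agent, "stub_judge": stub_judge}


def run_pipeline(spec_json: str) -> Context:
    """Run each block in order, threading a shared context through them."""
    spec = json.loads(spec_json)
    ctx: Context = {}
    for step in spec["pipeline"]:
        block = REGISTRY[step["block"]]
        ctx = block(ctx, step.get("params", {}))
    return ctx


# Hypothetical test definition.
SPEC = """
{
  "pipeline": [
    {"block": "mock_agent",
     "params": {"reply": "Your meeting with Sam is booked for Tuesday at 3pm."}},
    {"block": "stub_judge",
     "params": {"must_mention": ["Sam", "Tuesday"]}}
  ]
}
"""

result = run_pipeline(SPEC)
```

Because each block only reads and returns a context dictionary, adding a new kind of check means registering one more function rather than changing the runner.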

Would love feedback. Thank you!

1 comment

Hey everyone, I just made the library easier to understand by adding a landing page (www.semantictest.dev) and proper documentation (docs.semantictest.dev).