Hacker News
Show HN: Verdict – model evals on your own data, not someone else's benchmark (github.com)
2 points by agunapal 42 days ago