Hacker News
by h1fra 2 hours ago
Evals are glorified integration tests. Would you invest in an integration test startup? Absolutely not. I don't get why we are making all of this fuss about evals.
2 comments

Because what people actually want is a simple harness to test their use cases against all the frontier models and see which is the cheapest/best for the job.

It's simple to describe but hard to do well, and the important thing is that no matter what tool you have, the evals don't write themselves.
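
For what it's worth, here is a minimal sketch of the kind of harness I mean, in Python. Everything in it is an assumption for illustration: the `Case` and `Candidate` types, the flat `cost_per_call` number, and the two stub "models" are hypothetical stand-ins, not any vendor's API. In practice `generate` would wrap a real model call and the checks would be the hand-written part.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Case:
    """One use case: a prompt plus a hand-written check on the output."""
    prompt: str
    check: Callable[[str], bool]


@dataclass(frozen=True)
class Candidate:
    """A model under test. cost_per_call is a hypothetical flat price."""
    name: str
    cost_per_call: float
    generate: Callable[[str], str]


def pass_rate(candidate: Candidate, cases: Sequence[Case]) -> float:
    """Fraction of cases whose check accepts the candidate's output."""
    if not cases:
        raise ValueError("need at least one case")
    passed = sum(1 for case in cases if case.check(candidate.generate(case.prompt)))
    return passed / len(cases)


def cheapest_passing(
    candidates: Sequence[Candidate],
    cases: Sequence[Case],
    min_pass_rate: float,
) -> Optional[Candidate]:
    """Cheapest candidate that meets the pass-rate bar, or None if none does."""
    good_enough = [c for c in candidates if pass_rate(c, cases) >= min_pass_rate]
    if not good_enough:
        return None
    return min(good_enough, key=lambda c: c.cost_per_call)


# Stub "models" so the sketch runs offline; real ones would call an API.
def big_stub(prompt: str) -> str:
    return prompt.upper()


def small_stub(prompt: str) -> str:
    # Pretend the small model only handles short prompts correctly.
    return prompt.upper() if len(prompt) < 6 else prompt


CASES = [
    Case("hello", lambda out: out == "HELLO"),
    Case("frontier", lambda out: out == "FRONTIER"),
]

CANDIDATES = [
    Candidate("big-stub", 1.0, big_stub),
    Candidate("small-stub", 0.1, small_stub),
]

if __name__ == "__main__":
    for candidate in CANDIDATES:
        print(candidate.name, pass_rate(candidate, CASES))
    winner = cheapest_passing(CANDIDATES, CASES, min_pass_rate=1.0)
    print("pick:", winner.name if winner else "none")
```

The loop is the easy part. The `CASES` list is where the work goes, and no tool fills that in for you.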

There are a number of integration test startups. None of them do a great job, but they do exist.