by brianwmunz 21 days ago
"LLM evals" is maybe an overused term because it can mean a bunch of things. This article talks about LLM-as-a-judge where an LLM scores another system's outputs.