Hacker News
by brianwmunz, 21 days ago
"LLM evals" is arguably an overloaded term, since it can mean several different things. This article is specifically about LLM-as-a-judge, where one LLM scores another system's outputs.
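For anyone unfamiliar with the pattern, here is a rough sketch of what LLM-as-a-judge looks like in code. Everything in it is my own assumption rather than something taken from the article: the prompt wording, the 1 to 5 scale, and the names `judge_output`, `call_judge`, and `stub_judge` are all hypothetical. `call_judge` stands in for whatever function sends a prompt to your judge model and returns its text reply.

```python
from typing import Callable

# Hypothetical grading prompt; the wording and the 1-5 scale are assumptions.
JUDGE_PROMPT = (
    "Rate the answer to the question on a scale of 1 to 5.\n"
    "Question: {question}\n"
    "Answer: {answer}\n"
    "Reply with a single integer."
)


def judge_output(question: str, answer: str, call_judge: Callable[[str], str]) -> int:
    """Ask a judge model to score another system's answer.

    call_judge is a placeholder for a real LLM call: it takes a prompt
    string and returns the judge model's raw text reply.
    """
    prompt = JUDGE_PROMPT.format(question=question, answer=answer)
    reply = call_judge(prompt)
    score = int(reply.strip())  # raises ValueError if the reply is not an integer
    if not 1 <= score <= 5:
        raise ValueError(f"judge returned out-of-range score: {score}")
    return score


def stub_judge(prompt: str) -> str:
    """Stand-in for a real judge model so the sketch runs offline."""
    return "4"


if __name__ == "__main__":
    print(judge_output("What is 2 + 2?", "4", stub_judge))
```

In practice you would swap `stub_judge` for a real model call. The strict parsing is deliberate: judge replies are free text, so rejecting anything that is not a valid score is safer than silently guessing.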