Hacker News
by
eeasss
219 days ago
Are there any LLMs in particular that work best with G-Eval?
2 comments
lyuata
219 days ago
LLM Benchmark leaderboard for common evals sounds like a fun idea to me.
zlatkov
219 days ago
I haven’t come across any research showing that a specific LLM consistently outperforms others as a G-Eval judge. G-Eval generally works best with strong reasoning models that produce consistent outputs.
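One rough way to check the "consistent outputs" part for yourself is to score the same input several times with each candidate judge and compare the spread. Here is a minimal sketch, assuming you have already collected the repeated scores yourself; the function names and the judge names are hypothetical, and nothing here calls a real evaluation library.

```python
from statistics import mean, pstdev


def score_spread(scores):
    """Return (mean, population std dev) of repeated judge scores for one input."""
    if not scores:
        raise ValueError("need at least one score")
    return mean(scores), pstdev(scores)


def most_consistent(runs):
    """Given {judge_name: [scores, ...]}, return the judge with the smallest spread."""
    return min(runs, key=lambda name: score_spread(runs[name])[1])


if __name__ == "__main__":
    # Placeholder scores for illustration only, not measured results.
    runs = {
        "judge_a": [4, 4, 4, 4],
        "judge_b": [2, 5, 3, 4],
    }
    for name, scores in runs.items():
        print(name, score_spread(scores))
    print("smallest spread:", most_consistent(runs))
```

A low spread only tells you the judge is repeatable, not that its scores are right, so it would still need to be compared against human ratings.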