HN Mirror


by petesergeant 490 days ago

An LLM acting as a judge isn't telling you whether something is right or wrong; it's telling you whether a given generation is normal or an aberration, according solely to the data the model was trained on. This is one reason LLMs prefer their own answers to answers from other LLMs.
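A toy way to see the "normal vs. aberration" point, as a rough sketch only: here a tiny bigram model stands in for the LLM (an assumption for illustration, real judges are far more complex), and the training corpus, the candidate sentences, and the helper names `train_bigram` and `avg_log_prob` are all made up. The "judge" is just the average log-probability the model assigns to a sentence.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count context and bigram frequencies over whitespace-split sentences."""
    contexts, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split()
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    return contexts, bigrams

def avg_log_prob(sentence, contexts, bigrams, vocab_size):
    """Average per-token log-probability with add-one smoothing.

    This measures how typical the sentence is under the training data,
    not whether it is true.
    """
    tokens = ["<s>"] + sentence.lower().split()
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total / (len(tokens) - 1)

# Hypothetical training data that mostly repeats a false statement.
corpus = ["the sun orbits the earth"] * 3 + ["the moon orbits the earth"]
vocab_size = len({word for s in corpus for word in s.lower().split()})
contexts, bigrams = train_bigram(corpus)

typical_score = avg_log_prob("the sun orbits the earth", contexts, bigrams, vocab_size)
atypical_score = avg_log_prob("the earth orbits the sun", contexts, bigrams, vocab_size)

# The judge prefers the sentence that looks like its training data,
# even though the other one is the true statement.
print(typical_score > atypical_score)
```

The same model that would generate "the sun orbits the earth" also scores it highest, which is the self-preference effect in miniature.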

"LLMs as a judge" is more about addressing a failure mode of auto-regressive (one token at a time) generation, where an LLM leads itself astray because of its previous choices, than about telling you any general truth.

Finally, I'd note that in every maths challenge I ever completed as a student, you were strongly advised to go back and check your own work at the end if you had time left over, and for me this usually led to catching things I'd missed the first time.

1 comments

taurknaut 490 days ago

> Using LLMs to judge correctness

It seems their PR is willing to make much stronger claims than you are.


procaryote 489 days ago

PR's view of truth is "can I convince legal we probably won't lose if we're sued for saying this", which is a fairly weak form of truth.
