

	by senko 563 days ago
	Too late to edit, but here's a great, really in-depth post about using LLMs as judges to evaluate LLM outputs (for when you don't have ground truth for everything): https://cameronrwolfe.substack.com/p/finetuned-judge The post itself is about finetuning LLMs to act as judges, but the first part is a good intro to why and how.
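	For anyone who hasn't seen the pattern, here's a rough sketch of the basic LLM-as-judge loop: build a grading prompt, send it to a judge model, and parse a score out of the reply. This is not taken from the linked post. The prompt wording, the 1-5 scale, and the names `build_judge_prompt`, `parse_score`, `judge`, and `fake_llm` are all my own assumptions, and `call_llm` is a placeholder for whatever model client you actually use.

```python
import re
from typing import Callable, Optional

# Hypothetical grading prompt; the wording and the 1-5 scale are assumptions.
JUDGE_TEMPLATE = (
    "You are grading an answer to a question.\n"
    "Question: {question}\n"
    "Answer: {answer}\n"
    "Rate the answer from 1 (poor) to 5 (excellent).\n"
    'Reply with a line of the form "Score: <number>" and a brief reason.'
)


def build_judge_prompt(question: str, answer: str) -> str:
    """Fill the grading template with the item to be judged."""
    return JUDGE_TEMPLATE.format(question=question, answer=answer)


def parse_score(reply: str) -> Optional[int]:
    """Extract a 1-5 score from the judge's reply, or None if it is missing."""
    match = re.search(r"Score:\s*([1-5])\b", reply)
    return int(match.group(1)) if match else None


def judge(question: str, answer: str, call_llm: Callable[[str], str]) -> Optional[int]:
    """Ask the judge model (any prompt -> text callable) to score one answer."""
    return parse_score(call_llm(build_judge_prompt(question, answer)))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return "Score: 4\nMostly correct but misses an edge case."


if __name__ == "__main__":
    print(judge("What is 2 + 2?", "4", fake_llm))
```

	In practice you'd swap `fake_llm` for a real API call and treat a `None` score as a failed judgment to retry or discard rather than silently counting it.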