HN Mirror

	by remilouf 775 days ago
	LLM evaluations are very sensitive to the details of a prompt's structure. This post shows how structured generation reduces both the variance of the results and the shifts in ranking.