| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by Eisenstein 783 days ago
	"The quality of the prompts used in SFT and the preference rankings used in PPO and DPO played a crucial role in the performance of the aligned models. Meta's team carefully curated this data and performed multiple rounds of quality assurance on annotations provided by human annotators." * https://www.unite.ai/everything-you-need-to-know-about-llama...