| HN Mirror


	by osmarks 727 days ago
	Mistral and Meta release both "instruct" (RLHF) and non-instruct models. The non-instruct ones are in fact pretraining-only, with no RLHF (though they probably have ChatGPT-ish text in the dataset nowadays, and Meta might have done some extra training on evals...).