| HN Mirror

	by nl 108 days ago
	Model distillation is very useful! Put it like this: Reinforcement Learning from Human Feedback (RLHF) is already useful with only a few hundred examples, and LLM distillation is basically the same idea, with a stronger model's outputs standing in for the human feedback.
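To make the "hundreds of examples" point concrete, here is a rough sketch of how a small distillation set might be put together: run a handful of prompts through a teacher model and save the prompt/answer pairs as fine-tuning data for a smaller student. Everything named here is an assumption for illustration: `toy_teacher` is a stand-in for a real call to a large model, and the `prompt`/`completion` field names are just one common JSONL layout, not any particular vendor's format.

```python
import json
from typing import Callable, Dict, List


def build_distillation_set(
    prompts: List[str], teacher: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Pair each prompt with the teacher's answer to form supervised examples."""
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def to_jsonl(examples: List[Dict[str, str]]) -> str:
    """Serialize the examples one JSON object per line."""
    return "\n".join(json.dumps(ex) for ex in examples)


def toy_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a large model; a real teacher would be an
    # API or local inference call returning the model's answer.
    return prompt.upper()


examples = build_distillation_set(
    ["summarize: cats sleep a lot", "summarize: dogs bark"], toy_teacher
)
jsonl = to_jsonl(examples)
```

In practice the list of prompts would be a few hundred task-specific inputs, and the resulting file would go to whatever fine-tuning pipeline the student model uses.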