	by cubefox 1133 days ago
	There are two innovations: instruction fine-tuning (via supervised learning), which gives you a model that behaves as if it is in a dialogue instead of just predicting the next piece of text, and, on top of that, reinforcement learning from human feedback (RLHF), which steers how it responds to those instructions.
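
To make the two training signals concrete, here is a rough sketch of the loss each stage typically minimizes. It is a toy illustration under my own assumptions, not any lab's actual code: the function names `sft_loss` and `preference_loss` are made up, the inputs are plain Python floats rather than model outputs, and masking out the prompt tokens is one common choice rather than a requirement.

```python
import math

def sft_loss(token_logprobs, is_response_token):
    """Instruction fine-tuning (supervised): mean negative log-likelihood
    of the demonstration, counted over the response tokens only.
    Assumption: prompt tokens are masked out of the loss."""
    picked = [-lp for lp, keep in zip(token_logprobs, is_response_token) if keep]
    return sum(picked) / len(picked)

def preference_loss(reward_chosen, reward_rejected):
    """Reward-model step of RLHF: pairwise loss -log(sigmoid(r_c - r_r)).
    It is small when the response humans preferred gets the higher reward."""
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(x)) is the same as log(1 + exp(-x))
    return math.log1p(math.exp(-margin))

# Hypothetical per-token log-probabilities: two prompt tokens, two response tokens.
print(sft_loss([-0.1, -0.2, -1.0, -3.0], [False, False, True, True]))
# Hypothetical reward scores for a preferred and a rejected answer.
print(preference_loss(2.0, 0.0), preference_loss(0.0, 2.0))
```

The first function only teaches the model to imitate example answers. The second trains a separate reward model from human comparisons, and in the usual recipe the dialogue model is then optimized with reinforcement learning to score well under that reward model.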