| HN Mirror



	by junipertea 1248 days ago
	They also did reinforcement learning on top of a frozen pretrained model. That is considerably more than just attaching a UI: a base model behind a UI would only finish sentences rather than answer questions. https://huggingface.co/blog/rlhf