


	by ProofHouse 305 days ago
	Well, they can be used together in some contexts. So while they are different, you could also say RL can complement supervised fine-tuning as a further optimization step.