	by HarHarVeryFunny 168 days ago
	The entire history of RL-trained "reasoning models", from o1 to DeepSeek-R1, is basically just a year old!