| HN Mirror

	by m101 684 days ago
	Perhaps the smaller model used in o1 is overtrained on arXiv and code relative to 4o (or undertrained on legal text).