| HN Mirror

	by johntb86 510 days ago
	I'd be curious what would happen if you SFTed a larger model with successful reasoning traces from the smaller model. Would it pick up the overall reasoning pattern, but be able to apply it to more cases?
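
	For concreteness, here is a minimal sketch of the data-preparation half of that idea: keep only the smaller model's traces whose final answer matches a known reference, then format them as prompt/completion pairs for supervised fine-tuning. The `Trace` fields, the exact-match success check, and the output format are all illustrative assumptions, not anything specified in the thread, and the fine-tuning step on the larger model is left out.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    """One reasoning attempt by the smaller model (hypothetical schema)."""
    prompt: str
    reasoning: str
    answer: str
    reference: str  # known-correct answer used to judge success


def build_sft_examples(traces):
    """Keep traces whose answer exactly matches the reference.

    Exact string match is an assumed success criterion; a real setup
    might use a verifier or a looser comparison.
    """
    examples = []
    for t in traces:
        if t.answer.strip() == t.reference.strip():
            examples.append({
                "prompt": t.prompt,
                "completion": f"{t.reasoning}\nAnswer: {t.answer.strip()}",
            })
    return examples


# Toy data: the second trace reaches a wrong answer and is dropped.
traces = [
    Trace("What is 2 + 3?", "2 plus 3 is 5.", "5", "5"),
    Trace("What is 7 * 6?", "7 times 6 is 48.", "48", "42"),
]
sft_examples = build_sft_examples(traces)
```

	The resulting list of prompt/completion pairs is the kind of dataset one would then hand to an SFT trainer for the larger model. Whether the larger model generalizes the pattern is exactly the open question above.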