HN Mirror



	by suddenlybananas 124 days ago
	Why would they bother? Because it costs essentially nothing to add it to the training data. My point is that once a reasoning example becomes sufficiently viral, it ceases to be a good test, because companies have a massive incentive to correct it. The fact that some models got it right before (unreliably) doesn't mean the companies wouldn't want to ensure that their models get it right every time.