| HN Mirror


	by andai 30 days ago
	Didn't multiple studies find that the reasoning traces didn't have much to do with the final output? And even that outputting placeholder tokens during reasoning has a similar beneficial effect on benchmark scores? (I don't think that's the full picture, but there's definitely something fishy going on there.)

1 comment

tensegrist 29 days ago

reasoning itself just affords the model a ton of extra forward passes / "time to think"

the, how do you say, "misalignment" between the content of the reasoning tokens and the actual output that follows the end of the reasoning is a separate problem, extensively studied by e.g. Anthropic