| HN Mirror



	by robertkarl 2 hours ago
	https://arxiv.org/abs/2606.00206 In this paper they nerf an LLM's ability to emit waffling thinking tokens like "wait", "but", and "alternatively". The models (old, small ones in the paper) terminate reasoning faster and perform better. I bet Anthropic is tuning this on their backend.
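
	For a rough idea of what nerfing a token could look like mechanically, here is a minimal sketch. It assumes the suppression is a fixed penalty subtracted from the logits of the chosen tokens before decoding; the linked paper's actual method may differ. The toy vocabulary, the logit values, the penalty, and the function names are all made up for illustration.

```python
import math

# Hypothetical toy vocabulary; a real tokenizer would map words to token IDs.
VOCAB = ["so", "wait", "but", "alternatively", "therefore", "</think>"]
HEDGE_WORDS = {"wait", "but", "alternatively"}


def suppress_tokens(logits, vocab, banned, penalty=5.0):
    """Return a copy of logits with `penalty` subtracted for banned tokens."""
    return [
        logit - penalty if token in banned else logit
        for token, logit in zip(vocab, logits)
    ]


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits, vocab):
    """Pick the highest-scoring token (greedy decoding)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return vocab[best]


# Made-up logits (assumption) in which "wait" narrowly wins before suppression.
raw = [1.0, 2.5, 2.0, 0.5, 2.2, 1.8]
adjusted = suppress_tokens(raw, VOCAB, HEDGE_WORDS)

print(greedy_pick(raw, VOCAB))       # choice without the penalty
print(greedy_pick(adjusted, VOCAB))  # choice with hedge words penalized
```

	A finite penalty makes the hedge words less likely instead of banning them outright, so the model can still say "but" when nothing else fits.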