HN Mirror



	by reliablereason 2 hours ago
	Is the thinking even done in real tokens? I thought it was done using the pure residual stream. That is, instead of collapsing the residual stream to a token, you treat the final layer's output as a vector of size d_model and use that as the input for the next position in the transformer. If that is the case, the thinking is not visible to us as users because it is not done in text.
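
To make the distinction concrete, here is a minimal toy sketch of the two feedback loops described above. Everything in it is an assumption for illustration: the sizes `D_MODEL` and `VOCAB`, the random matrices, and the single `tanh` layer standing in for a whole transformer stack (there is no attention over earlier positions). It is not how any particular deployed model works.

```python
import math
import random

D_MODEL = 4  # hypothetical hidden size
VOCAB = 6    # hypothetical vocabulary size

rng = random.Random(0)

def rand_matrix(rows, cols):
    """Random weights; stand-ins for trained parameters."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

EMBED = rand_matrix(VOCAB, D_MODEL)    # token id -> input vector
UNEMBED = rand_matrix(D_MODEL, VOCAB)  # hidden vector -> logits
W = rand_matrix(D_MODEL, D_MODEL)      # toy replacement for the layer stack

def matvec(vec, mat):
    """Row vector times matrix."""
    return [sum(v * mat[i][j] for i, v in enumerate(vec))
            for j in range(len(mat[0]))]

def forward(x):
    """Toy forward pass at one position: returns a d_model-sized vector."""
    return [math.tanh(h) for h in matvec(x, W)]

def token_step(x):
    """Ordinary decoding: collapse the hidden state to a token, then re-embed it."""
    hidden = forward(x)
    logits = matvec(hidden, UNEMBED)
    token_id = max(range(VOCAB), key=lambda i: logits[i])  # greedy pick
    return token_id, EMBED[token_id]

def latent_step(x):
    """Latent decoding: skip the vocabulary and feed the hidden state back as-is."""
    return forward(x)

def run(steps, start_token=0):
    """Run both loops side by side from the same starting embedding."""
    x_tok = x_lat = EMBED[start_token]
    visible_tokens = []
    for _ in range(steps):
        token_id, x_tok = token_step(x_tok)
        visible_tokens.append(token_id)  # readable trace exists here
        x_lat = latent_step(x_lat)       # no token is ever produced here
    return visible_tokens, x_lat

if __name__ == "__main__":
    tokens, latent = run(3)
    print("token loop trace:", tokens)
    print("latent loop final state:", latent)
```

The token loop leaves a list of token ids that could be decoded to text, while the latent loop only ever holds a vector of floats, which is the visibility concern raised in the comment.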

4 comments

TeMPOraL 1 hour ago

I saw that idea described as a step in AI 2027 (they call it "neuralese", and from eyeballing the site, it's still labeled a hypothetical/future development), but AFAIK no one has implemented/deployed this yet.

EDIT:

They link to a Meta paper from 2024/2025 though: https://arxiv.org/pdf/2412.06769/.


giancarlostoro 2 hours ago

Claude does all its thinking in text; it's ChatGPT that does not do its reasoning in text. I believe it's sort of implied / understood (?) that this is part of Claude's secret sauce over OpenAI. OpenAI will use fewer tokens, but Claude will be correct more of the time.


wqaatwt 2 hours ago

All open models that have reasoning seem to do it in text tokens. Is there any indication that closed models approach this in some fundamentally different way?


throwuxiytayq 1 hour ago

That would be a huge deal, meaning we've lost even our shitty, ineffective ways of monitoring the agent's reasoning stream. A big setback when it comes to alignment and interpretability.

I don't know about Claude, but the latest GPT versions still have a readable reasoning stream. It sometimes leaks out when the model gets confused, e.g., during a tool call. If you're curious, it looks simplified: fewer words, extremely compact. They optimize for tokens, but it remains readable.
