| HN Mirror



	by ckrapu 492 days ago
	My opinion is that opaque reasoning is a prerequisite for many of the worst possible AI outcomes. We should make reasoning fully visible in the output space.

5 comments

optimalsolver 492 days ago

Is there any evidence that the reasoning tokens output by current models actually represent the computation happening in the hidden layers?

In both cases, the model is doing a ton of processing that you can't actually inspect, except here, you at least get some efficiency gains.

Even more importantly, you're also less likely to convince yourself that you know what the model is thinking.


ckrapu 490 days ago

In the autoregressive decoding framework, the hidden layers' state for computation of token `t` is conditionally independent of all hidden states for `t-1`, `t-2` and so on given the observed tokens.

Put differently, the observed tokens are a bottleneck on the information that can be communicated across tokens. Any scheming performed by an LLM which requires more than one token to formulate must therefore pass through the visible tokens. With opaque vectors transferred across decoding steps, this is not the case.
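A toy sketch of the bottleneck argument, not a real transformer: `forward`, `decode_visible`, `decode_with_latent`, the four-word vocabulary, and the arithmetic are all hypothetical stand-ins chosen for illustration. It assumes the forward pass is a deterministic function of the visible tokens, so an observer can recompute every hidden state from the transcript. Once an opaque value is carried across steps, two runs can emit identical text while holding different internal state.

```python
VOCAB = ["plan", "check", "act", "wait"]  # made-up vocabulary for the toy model


def forward(tokens):
    """Toy forward pass: the hidden state is a pure function of visible tokens."""
    hidden = sum(ord(ch) for tok in tokens for ch in tok) % 97
    return hidden, VOCAB[hidden % len(VOCAB)]


def decode_visible(prompt, steps):
    """Token-only decoding: nothing crosses a step except the tokens themselves."""
    tokens = list(prompt)
    for _ in range(steps):
        _, tok = forward(tokens)
        tokens.append(tok)
    return tokens


def decode_with_latent(prompt, steps, latent):
    """Same visible behaviour, plus an opaque value carried from step to step."""
    tokens = list(prompt)
    for _ in range(steps):
        hidden, tok = forward(tokens)
        # The carried latent accumulates information that never reaches the text.
        latent = (latent * 31 + hidden) % 1009
        tokens.append(tok)
    return tokens, latent


if __name__ == "__main__":
    prompt = ["user", "asks"]
    print(decode_visible(prompt, 5))
    print(decode_with_latent(prompt, 5, latent=1))
    print(decode_with_latent(prompt, 5, latent=2))
```

In the token-only loop, rerunning `forward` on any prefix of the transcript reproduces the hidden state for that step exactly. In the latent loop, the two runs above print the same tokens but end with different carried values, which is the kind of state a reader of the transcript cannot see.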

The computation in the hidden layers, as far as we can tell, is not sufficient for scheming in a single decoding step. It looks like it requires O(10^2) or O(10^3) steps instead, judging from anecdotal evidence like the reports of scheming from o1 (https://cdn.openai.com/o1-system-card-20241205.pdf).

As far as your last point goes, I'd rather have a more transparent system, all other factors held constant.


anothermathbozo 492 days ago

No, and we’ve observed evidence to the contrary.


mola 492 days ago

Do you have some reading material on this? How did they distinguish between the stated CoT and the "actual processing"?


miven 491 days ago

Chain of thought isn't exactly transparent either. You shouldn't fall into the trap of believing that the final sequence of tokens thinking about the task is the only processing the model actually performs during CoT.

There might be a lot of other hidden computation happening within the model's latents that does not immediately influence the predicted tokens but is still relevant to the model's internal processing. And even disregarding that, the model is under no formal obligation to stick to the chain of thought it produced when making its final decisions.


nsikorr 491 days ago

The paper suggests that that is still possible with the proposed architecture if needed.


DennisP 492 days ago

That actually sounds like it'd be really helpful.


Imanari 491 days ago

Maybe let it reason in latent space, but have a method to transform that reasoning into text for inspection.
