Yeah, now I get what you're saying. Yes, the trace isn't what's actually happening. What's actually happening is just the attention mechanism etc. The model doesn't "think" in human language, it thinks in linear algebra. The thing is that before chain of thought, you had to get the model to output some language, because generated text was the only thing it had to attach processing to (so if you wanted more processing, you needed to get it to generate more text). Whereas now we get the model to generate some text that is a simulacrum of the thought it might hypothetically be doing, but in actual practice chain of thought is just something they get the model to do by training it in a certain way.
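
To make the "more text means more processing" point concrete, here's a toy sketch. Everything in it is made up for illustration: `forward_pass` and `generate` are hypothetical names, and the "model" is a trivial stand-in rather than a real transformer. The only thing it's meant to show is that autoregressive generation runs one forward pass per generated token, so a longer output means more passes.

```python
VOCAB = ["a", "b", "c"]  # hypothetical toy vocabulary


def forward_pass(context):
    """Stand-in for one forward pass over the whole context.

    A real model would run attention and MLP layers here; this toy
    just picks a token deterministically from the context length.
    """
    return VOCAB[len(context) % len(VOCAB)]


def generate(prompt, n_new_tokens):
    """Generate tokens one at a time, counting forward passes."""
    context = list(prompt)
    passes = 0
    for _ in range(n_new_tokens):
        # Each new token costs exactly one more forward pass,
        # and the token is fed back in as input for the next one.
        context.append(forward_pass(context))
        passes += 1
    return context, passes


short_out, short_passes = generate(["q"], 2)
long_out, long_passes = generate(["q"], 10)
print(short_passes, long_passes)
```

In this sketch the only way to get more compute spent on the same prompt is to ask for more tokens, which is roughly the situation I was describing.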