| > what is going on internally in the reasoning layer. We literally know exactly what is going on with every layer. It’s well defined. There are mathematical proofs for everything. Moreover it’s all machine instructions which can be observed. The emergent properties we see in LLMs are surprising and impressive, but not magic. Internally what is happening is a bunch of matrix multiplications. There’s no internal thought or process or anything like that. It’s all “just” math. To assume anything else is personification bias. To look at LLMs outputting text and a human writing text and think “oh these two things must be working in the same way” is just… not a very critical line of thought. |
Unless I missed a huge breakthrough on the observability problem, this isn't correct.
We know exactly how every layer is designed, and we know how we expect it to work functionally. We don't know what actually happens inside the model at inference time.
I.e., we know what pieces were used to build the thing, but when we actually use it, it's a black box: we only know the inputs and outputs.
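To make the distinction concrete, here's a toy sketch in plain Python. It is not how any real LLM is built: the `forward` function, the sizes, and the random weights (standing in for trained ones) are all made up for illustration. The point is that every step is "just math" and every intermediate number can be printed, yet nothing in those numbers says what they represent.

```python
import math
import random


def forward(x, w1, w2):
    """Run a tiny two-layer network; return (hidden activations, output)."""
    # Each hidden unit is tanh of a weighted sum of the inputs.
    hidden = [math.tanh(sum(xi * wi for xi, wi in zip(x, row))) for row in w1]
    # The output is a weighted sum of the hidden activations.
    output = sum(h * w for h, w in zip(hidden, w2))
    return hidden, output


random.seed(0)  # arbitrary seed, only so reruns give the same numbers
n_in, n_hidden = 3, 4

# Hypothetical weights: random values standing in for a trained model.
w1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
w2 = [random.uniform(-1, 1) for _ in range(n_hidden)]

hidden, output = forward([1.0, 0.5, -0.25], w1, w2)

# The architecture is fully known and the values are fully readable,
# but the hidden vector is just a list of floats with no labels attached.
print("hidden:", hidden)
print("output:", output)
```

Scale that hidden list up to billions of values across many layers and you get the gap I mean: knowing the arithmetic is not the same as knowing what the model is doing with it.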