by PoignardAzur
755 days ago
"Transformers Represent Belief State Geometry in their Residual Stream": https://www.lesswrong.com/posts/gTZ2SxesbHckJ3CkF/transforme... The basic finding is that transformers don't just store a world-model in the sense of "what does the world that produces the observed inputs look like?"; they store a "Mixed-State Presentation", roughly a weighted set of possible worlds that could have produced the observed inputs.
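
To make "weighted set of possible worlds" a bit more concrete, here is a minimal sketch of my own (not code from the post) of how such a weighting gets updated as inputs arrive. It assumes the "worlds" are the hidden states of a small hidden Markov model; the two-state model, the table `T`, and the function names are all hypothetical and the numbers are made up.

```python
# Hypothetical 2-state hidden Markov model; all numbers are illustrative.
# T[x][i][j] = P(emit symbol x and move to state j | currently in state i).
# For each starting state i, the entries over all x and j sum to 1.
T = {
    "a": [[0.45, 0.05], [0.10, 0.10]],
    "b": [[0.05, 0.45], [0.40, 0.40]],
}


def update_belief(belief, symbol):
    """One Bayesian filtering step: condition the belief on an observed symbol."""
    n = len(belief)
    # Unnormalized weight of each next state j after seeing `symbol`.
    unnorm = [sum(belief[i] * T[symbol][i][j] for i in range(n)) for j in range(n)]
    total = sum(unnorm)  # probability of `symbol` under the current belief
    return [u / total for u in unnorm]


def belief_after(observations, prior=(0.5, 0.5)):
    """Fold a sequence of observed symbols into a belief over hidden states."""
    belief = list(prior)
    for symbol in observations:
        belief = update_belief(belief, symbol)
    return belief


if __name__ == "__main__":
    print(belief_after("aab"))
```

Each observed prefix maps to one such weight vector, so the claim in the post, as I read it, is about those vectors being recoverable from the residual stream rather than a single best-guess state.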