by HarHarVeryFunny
236 days ago
Thanks, that's a useful way to think about it. Presumably the internal state at any given token position must also encode information specific to that position, as well as this evolving/current memory. So, can this be seen in the internal embeddings? Are they composed of a position-dependent part that changes a lot between positions, and an evolving memory part that is largely similar between positions, changing only slowly? Are there any papers or talks discussing this?