by Zondartul
817 days ago
Tokens are also needed to store information, or at least to off-load it from neuron activations. E.g. if you ask an LLM to "think about X and then do Y" and the "think about X" part is silent, there is a high chance the LLM will either: a) just not do it, or b) think about it but then forget the result, because the capacity of its 'RAM' (the neuron activations) is unknown but probably holds less than a few tokens' worth of information. Actually, has anyone tried to measure how much non-context data (i.e. new data derived from the context) an LLM can keep "in memory" without writing it down?