by loehnsberg
62 days ago
I think if we want to build on what we have, then instead of compacting at the end of the context window, the LLM would have to 'sleep', i.e. adjust its weights, then wake up with the last bits of the old context window in the new one and a 'feel' for what it did before, carried by the change in weights. I just sense it's not that simple to get there, because updating the weights based on a single context sample risks degrading the whole network. I like the idea of using a small local model (or several) to tackle this problem, something like low-rank adaptation, but with current tech this still has to be pieced together by hand, or the small local models will forget old memories.
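To make the low-rank adaptation idea concrete, here is a rough sketch, not anyone's actual system: a frozen weight matrix plus a small trainable low-rank delta, updated from a single sample. Everything in it (the `LoRALinear` class, `sgd_step`, the toy sizes, the squared-error loss) is made up for illustration. It only shows that a 'sleep' update can leave the base weights untouched; it does nothing about the adapter forgetting older memories.

```python
import random


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


class LoRALinear:
    """Hypothetical toy layer: y = W x + B (A x), with W frozen."""

    def __init__(self, W, rank, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(W), len(W[0])
        self.W = W  # frozen base weights, never modified below
        # A starts small and random, B starts at zero, so the adapter
        # contributes nothing until it has been trained.
        self.A = [[rng.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def forward(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + d for b, d in zip(base, delta)]

    def sgd_step(self, x, target, lr=0.1):
        """One gradient step on 0.5 * ||y - target||^2, adapter only.

        Returns the loss measured before the update.
        """
        h = matvec(self.A, x)
        err = [y - t for y, t in zip(self.forward(x), target)]
        # Gradient w.r.t. h uses the current B, so compute it first.
        g_h = [sum(self.B[i][k] * err[i] for i in range(len(err)))
               for k in range(len(h))]
        for i in range(len(self.B)):
            for k in range(len(h)):
                self.B[i][k] -= lr * err[i] * h[k]
        for k in range(len(self.A)):
            for j in range(len(x)):
                self.A[k][j] -= lr * g_h[k] * x[j]
        return 0.5 * sum(e * e for e in err)


# Toy usage: one 'memory' (x -> target) learned in a few adapter-only steps.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W_before = [row[:] for row in W]
x, target = [1.0, 2.0, 3.0], [0.0, 0.0]

layer = LoRALinear(W, rank=2)
initial_output = layer.forward(x)  # equals W x, since B is zero
losses = [layer.sgd_step(x, target) for _ in range(5)]
```

The base matrix `W` is identical before and after the updates, so throwing away `A` and `B` restores the original behaviour. The catch is the one already raised: one small adapter trained on the newest context will overwrite whatever it learned from the previous one unless something merges or replays the old adapters.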
That's not how an LLM can work right now: it needs too many iterations and a much bigger dataset than what we can work with. We can experience something a single time and remember it. That's orders of magnitude more efficient than what an LLM can currently achieve.