by usernametaken29
34 days ago
> δ-mem compresses past information into a fixed-size state matrix updated by delta-rule learning

This doesn’t solve the capacity problem of memory. You can cram more into one context window, but you still need to associate what is stored with input queries. That’s very hard because slight variations in input create hugely different activations. So really, it doesn’t improve caching.
This paper might get a bit closer to the compression limit for context windows, but there’s a fundamental limit on how much information can go into one.
What you really need is contextual search, as in, different events and objects with the same abstractions and semantics lead to the same response, so you can cache effectively. On this front the paper does little to improve “memory” in a meaningful way.
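
To make the capacity point concrete, here is a rough sketch of a delta-rule memory in plain Python. It is not the paper's code: it assumes the textbook update S ← S + β(v − Sk)kᵀ with unit-norm keys, and the names (`delta_update`, `read`, `recall_error`) are made up for illustration.

```python
import math
import random


def delta_update(S, k, v, beta=1.0):
    """One delta-rule write (assumed textbook form): S <- S + beta * (v - S k) k^T."""
    # Read the current prediction for key k before changing S.
    pred = [sum(row[j] * k[j] for j in range(len(k))) for row in S]
    for i in range(len(S)):
        err = v[i] - pred[i]
        for j in range(len(k)):
            S[i][j] += beta * err * k[j]
    return S


def read(S, k):
    """Query the memory: returns S k."""
    return [sum(row[j] * k[j] for j in range(len(k))) for row in S]


def unit(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def recall_error(d, n_items, seed=0):
    """Write n_items random key/value pairs into a d x d state in one pass,
    then return the mean squared recall error over all stored pairs."""
    rng = random.Random(seed)
    S = [[0.0] * d for _ in range(d)]
    keys = [unit([rng.gauss(0, 1) for _ in range(d)]) for _ in range(n_items)]
    vals = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n_items)]
    for k, v in zip(keys, vals):
        delta_update(S, k, v)
    total = 0.0
    for k, v in zip(keys, vals):
        out = read(S, k)
        total += sum((a - b) ** 2 for a, b in zip(out, v))
    return total / (n_items * d)


if __name__ == "__main__":
    # Hypothetical sizes: an 8x8 state, increasingly overloaded.
    for n in (1, 2, 8, 64):
        print(n, recall_error(8, n))
```

With a d×d state, a single item is recalled essentially exactly, but once you write many more than d random pairs, later writes overwrite earlier ones and the average recall error climbs. The state size stays fixed, so the interference has to go somewhere.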
https://jdsemrau.substack.com/p/tokenmaxxing-and-optimizing-...