| HN Mirror

I feel like the others didn't understand the part where you weren't into AI lingo. You're correct, this is like having a memory for an LLM, because an LLM is stateless.

When you chat with an LLM, there's a concept called 'conext'. In essence, context is feeding all previous messages into the LLM together with your latest message. Because context is essentially a finite resource (it requires system memory and increases processing time) the bigger AI providers use tricks to compress context.

These providers usually also have 'memory', which in essence is just parts of previous chats that are entered into the current context based on their relevance. I don't know exactly how this works, but I'd imagine that it does some search for related chats and then adds summaries of those.

In essence, this tool allows you to do those things locally. This allows you much more control of what history the LLM gets and therefore the 'context' it works with. This is important, because context can get dirty. You can notice this if you're chatting with an LLM, it goes completely the wrong way, you try to get it on the right track again, but it just won't. That's because it just tries to predict the next word based on the full context and it might end up consistently predicting the wrong next word.