by a1j9o94 976 days ago
This is super interesting! I was thinking about how to approach a similar problem for a project I'm working on, and my approach is similar.

I am curious about the benefit of having the agent both interact with the user (or do the task) and manage its own memory, instead of having a separate observer agent that modifies the memory. My thinking is that a separate observer would let the main agent spend all of its tokens on the task rather than on memory management.

1 comment

Explicit memory management (MemGPT-style) vs implicit/external memory management is an interesting tradeoff. Like you said, the instructions on how to manage memory consume ~1k tokens (using the default prompts in our MemGPT GitHub release), which is a lot when your context window is 8k. Additionally, the explicit approach requires the base LLM to be very good at instruction following; gpt-4 can do it well, but it's much more difficult to get explicit memory management to work with gpt-3.5-turbo or llama2 70b finetunes (so to build a robust system, you may end up having to "split" the thinking out of necessity).
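To put the token overhead in concrete terms, here is a rough back-of-the-envelope sketch. The figures are assumptions taken from the approximate numbers above: ~1k tokens is treated as 1000, and an 8k window as 8192. The function names are made up for illustration and are not part of MemGPT.

```python
def memory_prompt_overhead(context_window: int, memory_prompt_tokens: int) -> float:
    """Fraction of the context window used by memory-management instructions."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return memory_prompt_tokens / context_window


def remaining_tokens(context_window: int, memory_prompt_tokens: int) -> int:
    """Tokens left for the task, conversation history, and the model's output."""
    return max(context_window - memory_prompt_tokens, 0)


# Assumed values: ~1k tokens of instructions in an 8k (8192-token) window.
CONTEXT_WINDOW = 8192
MEMORY_PROMPT_TOKENS = 1000

overhead = memory_prompt_overhead(CONTEXT_WINDOW, MEMORY_PROMPT_TOKENS)
left = remaining_tokens(CONTEXT_WINDOW, MEMORY_PROMPT_TOKENS)
print(f"{overhead:.1%} of the window goes to memory instructions, {left} tokens remain")
```

Under these assumptions roughly an eighth of the window is spent before the conversation starts; the same instructions would be a much smaller share of a larger window.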

One of the main benefits of explicit memory management is simplicity - e.g., you don't have to manage logic between a "memory creation" thread and a "dialogue thread". The explicit approach also integrates well with the iterative paging/retrieval for document analysis we demo in the paper/on GitHub.
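The structural difference between the two designs can be sketched roughly as follows. This is a toy illustration, not MemGPT's actual implementation: the "LLMs" are plain Python stubs, and every name here (`explicit_agent_turn`, `observer_turn`, the `memory_append` key) is hypothetical.

```python
from typing import Callable, Dict, List, Optional

Memory = List[str]


def explicit_agent_turn(llm: Callable[..., Dict[str, str]],
                        memory: Memory, user_msg: str) -> str:
    """Explicit style: one model call returns the reply and any memory edit."""
    out = llm(memory, user_msg)
    note = out.get("memory_append")  # hypothetical key for a memory write
    if note:
        memory.append(note)
    return out["reply"]


def observer_turn(dialogue_llm: Callable[..., str],
                  observer_llm: Callable[..., Optional[str]],
                  memory: Memory, user_msg: str) -> str:
    """Observer style: the dialogue model only replies; a second model
    inspects the exchange and decides what to store."""
    reply = dialogue_llm(memory, user_msg)
    note = observer_llm(memory, user_msg, reply)
    if note:
        memory.append(note)
    return reply


# Stub "models" standing in for real LLM calls (assumptions for the demo).
def stub_explicit_llm(memory: Memory, user_msg: str) -> Dict[str, str]:
    out = {"reply": f"Got it ({len(memory)} facts stored so far)."}
    if user_msg.lower().startswith("my name is"):
        out["memory_append"] = user_msg
    return out


def stub_dialogue_llm(memory: Memory, user_msg: str) -> str:
    return f"Got it ({len(memory)} facts stored so far)."


def stub_observer_llm(memory: Memory, user_msg: str, reply: str) -> Optional[str]:
    return user_msg if user_msg.lower().startswith("my name is") else None


explicit_memory: Memory = []
observer_memory: Memory = []
explicit_agent_turn(stub_explicit_llm, explicit_memory, "My name is Ada")
observer_turn(stub_dialogue_llm, stub_observer_llm, observer_memory, "My name is Ada")
print(explicit_memory, observer_memory)
```

Both variants end up with the same memory contents here. The difference is where the coordination lives: the explicit loop has a single call to manage, while the observer variant needs two prompts and logic to keep the two threads in step.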