Nice - good to see activity in this space. The more folks are on it, the quicker we'll get to a world with a satisfactory solution.

I like your approach - I went down a bit more of a swiss-army-knife one with a hybrid.

"Session memory": turns are stored in full so that the full context can be retrieved if an important fact is missed in compaction. After a certain time, sessions are compacted. I'm currently evaluating a model where only a certain number of turns per session are kept in plain text and compaction happens as a sliding window.

"Knowledge memory": all info is periodically fed into a knowledge graph extraction, and a knowledge graph is built and indexed.

"Memory chunking": chunks of memory are stored individually in a vector space, where they can be retrieved through similarity search as well as standard semantic searches.
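To make the sliding-window idea concrete, here is a minimal sketch of how session memory could work. It's not my actual implementation: `SessionMemory`, `max_plain_turns` and `naive_summarize` are made-up names, and the summariser is a placeholder for whatever LLM call you'd really use.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SessionMemory:
    """Hypothetical sliding-window session store (all names are illustrative)."""
    # (previous summary, evicted turns) -> new summary; assumed to be an LLM call
    summarize: Callable[[str, List[str]], str]
    max_plain_turns: int = 4
    recent: List[str] = field(default_factory=list)   # turns kept verbatim
    archive: List[str] = field(default_factory=list)  # every turn, never dropped
    summary: str = ""

    def add_turn(self, turn: str) -> None:
        # Always keep the full turn so details lost in compaction stay recoverable.
        self.archive.append(turn)
        self.recent.append(turn)
        overflow = len(self.recent) - self.max_plain_turns
        if overflow > 0:
            # Fold the oldest turns into the running summary.
            evicted, self.recent = self.recent[:overflow], self.recent[overflow:]
            self.summary = self.summarize(self.summary, evicted)

    def context(self) -> str:
        # What the agent sees: the compacted summary plus the recent plain-text turns.
        parts = ([f"Summary: {self.summary}"] if self.summary else []) + self.recent
        return "\n".join(parts)


def naive_summarize(previous: str, evicted: List[str]) -> str:
    # Placeholder for a real summariser: keep the first 20 characters of each turn.
    clipped = [t[:20] for t in evicted]
    return "; ".join(([previous] if previous else []) + clipped)
```

The point of keeping `archive` next to `summary` is that compaction never destroys anything: the summary keeps the prompt small, and the full turns are still there to go back to.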

My theory is that giving the agent flexibility to query the tool best suited for its current needs is the way to go. As agents/LLMs become better, they'll only get better at summarisation and tool choice.
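Roughly, that means exposing each memory type as its own tool and letting the agent pick. A hedged sketch follows: the tool names, the in-memory stores and `call_tool` are all hypothetical, and plain word overlap stands in for real embedding similarity.

```python
from typing import Callable, Dict, List

# Hypothetical stores; real ones would be a session log, a graph DB and a vector index.
SESSIONS = {"2024-05-01": ["user: let's use Postgres", "agent: noted"]}
GRAPH = {("project", "uses"): ["Postgres"]}
CHUNKS = ["project uses Postgres for storage", "deploys run on Fridays"]


def session_lookup(query: str) -> List[str]:
    """Return the full turns of the session whose id equals the query."""
    return SESSIONS.get(query, [])


def graph_lookup(query: str) -> List[str]:
    """Query is 'subject relation', e.g. 'project uses'."""
    subject, _, relation = query.partition(" ")
    return GRAPH.get((subject, relation), [])


def chunk_search(query: str) -> List[str]:
    """Stand-in for similarity search: rank chunks by number of shared words."""
    words = set(query.lower().split())
    scored = [(len(words & set(c.lower().split())), c) for c in CHUNKS]
    return [c for score, c in sorted(scored, reverse=True) if score > 0]


# The agent is shown these names and picks one per request.
TOOLS: Dict[str, Callable[[str], List[str]]] = {
    "session_lookup": session_lookup,
    "graph_lookup": graph_lookup,
    "chunk_search": chunk_search,
}


def call_tool(name: str, query: str) -> List[str]:
    """Dispatch a tool call chosen by the agent."""
    if name not in TOOLS:
        raise KeyError(f"unknown memory tool: {name}")
    return TOOLS[name](query)
```

For example, `call_tool("graph_lookup", "project uses")` goes to the graph, while a fuzzy question like "postgres storage" is better served by `chunk_search`.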

A couple of reasons I prefer storing more data over less: storage is cheap, and if compaction misses a detail, you can prompt your agent to go back to that session you had about X last week and extract it.

As with yours, though, it's still very early stages and the ground is shifting rapidly. Claude Code's HTTP hooks are a great step towards a model that I believe can work, assuming it gets adopted as a standard. A big problem with making memory portable is that you have to be able to plug it into any client you want, and for the time being, MCP + system prompt is the only option - and even that one is more than flaky.

Would be awesome to chat to someone from OpenAI or Anthropic about their take and pow-wow on options.