by natpalmer1776
67 days ago
The armchair ML engineer in me says our current context management approach is the issue. With a proper memory management system wired up to its own LLM-driven orchestrator, memories should be pulled in and pushed out between prompts and, ideally, in the middle of a “thinking” cycle. You can make this performant with vector databases and such, but the core principle remains the same and is oft repeated by parents across the world: “Clean up your toys before you pull a new one out!” Also, since I thought for another 30 seconds: the “too many memories!” problem is, imo, the same problem as context management and compaction, and it requires the same approach: more AI telling AI what AI should be thinking about. Have the context manager de-rank “memories” it judges irrelevant and not pass them to the outer context. If a memory is de-ranked often and not used enough, it gets purged.
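
To make the de-rank-and-purge idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names `Memory`, `MemoryStore`, and `keyword_relevance`, the thresholds, and especially the relevance check, which is a trivial word-overlap stand-in for the LLM-driven orchestrator described above. It is not how ReadMe or any particular system works.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    deranked: int = 0  # times the context manager judged it irrelevant
    used: int = 0      # times it was passed to the outer context


class MemoryStore:
    """Hypothetical store: selects relevant memories, purges stale ones."""

    def __init__(self, max_deranks=3, min_use_ratio=0.25):
        self.memories = []
        self.max_deranks = max_deranks      # assumed threshold
        self.min_use_ratio = min_use_ratio  # assumed threshold

    def add(self, text):
        self.memories.append(Memory(text))

    def select(self, prompt, is_relevant):
        """Return memory texts to pass along; de-rank the rest, then purge."""
        selected = []
        for m in self.memories:
            if is_relevant(prompt, m.text):
                m.used += 1
                selected.append(m.text)
            else:
                m.deranked += 1
        self.memories = [m for m in self.memories if not self._should_purge(m)]
        return selected

    def _should_purge(self, m):
        # Purge only if de-ranked often AND rarely used.
        total = m.used + m.deranked
        return (m.deranked >= self.max_deranks
                and m.used / total < self.min_use_ratio)


def keyword_relevance(prompt, memory_text):
    # Stand-in for the LLM judge: any shared lowercase word counts as relevant.
    return bool(set(prompt.lower().split()) & set(memory_text.lower().split()))


store = MemoryStore(max_deranks=2)
store.add("user prefers tabs over spaces")
store.add("deploy runs on fridays")
first = store.select("format with tabs", keyword_relevance)
second = store.select("tabs again", keyword_relevance)
```

After the second call, the "deploy" memory has been de-ranked twice without ever being used, so it is purged, while the "tabs" memory stays. In a real system the `is_relevant` callback would presumably be a model call or a vector similarity lookup rather than word overlap.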
ReadMe does support loading memories mid-reasoning! It is simply an agent reading files.
That said, GPT-5.4 currently likes to explore a lot upfront and only then respond. That is more a model behaviour (adjustable through prompting) than an architectural limitation.