Can this be implemented in current opensource models?
A other option is to ask GPT to compress your tokens into a shorter prompt for itself.
[0] https://www.theverge.com/2023/4/14/23683084/openai-gpt-5-rum...
Can this be implemented in current opensource models?