by Pinkert
115 days ago
That's actually a great question, and the answer is yes and no.
It does disable the caching mechanism for the conversation history (though not for the system prompt, which remains constant). But there is a difference between a chatbot with an append-only chat history (just an exchange of messages) and an agent that uses a large part of the conversation as a kind of "scratchpad", sometimes even holding variable values at the beginning of the chat (to be sort of 'stateful'). If those variables change, if the scratchpad changes (it can be as much as 30%-40% of the entire conversation), if the cache times out (Claude gives you 5 minutes for normal caching), or if anything else changes in the exact history, you get a re-caching of the entire conversation. Additionally, caching still costs money. The main advantage of the librarian is that it is an 'insurance policy' for this caching mechanism. Combine it with solving the context rot issue and you get improved performance at scale.
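To make the scratchpad point concrete, here is a rough toy sketch of prefix-style caching. It is not Claude's actual implementation: the function names are made up for illustration, and it counts whole messages rather than tokens. The assumption is only that a cache hit requires the start of the request to match the previous request exactly, so everything after the first difference has to be processed again.

```python
def cached_prefix_len(previous, current):
    """Count the leading messages that are identical in both requests.

    Toy model (assumption): only this exact-match prefix can be served
    from cache; everything after the first difference is reprocessed.
    """
    n = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        n += 1
    return n


def reusable_fraction(previous, current):
    """Fraction of the current request covered by the cached prefix."""
    if not current:
        return 0.0
    return cached_prefix_len(previous, current) / len(current)


system = "You are a helpful agent."
turn1 = [system, "scratchpad: x=1", "user: hi", "assistant: hello"]

# Plain chatbot: the history only grows at the end.
chat_next = turn1 + ["user: next question"]

# Stateful agent: a variable held near the start of the chat changes.
agent_next = [system, "scratchpad: x=2", "user: hi", "assistant: hello",
              "user: next question"]

chat_hits = cached_prefix_len(turn1, chat_next)    # whole old history reused
agent_hits = cached_prefix_len(turn1, agent_next)  # only the system prompt reused

print(chat_hits, reusable_fraction(turn1, chat_next))
print(agent_hits, reusable_fraction(turn1, agent_next))
```

In this toy model the append-only chat reuses all four earlier messages, while the agent that edits its scratchpad near the top reuses only the system prompt and pays again for the rest. A cache timeout would behave like the worst case, with nothing reusable at all.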