Hacker News new | ask | show | jobs
by ajmurmann 5 days ago
It's true that the initial tool response still has the same amount of tokens but it doesn't keep dragged along in the longer-lived top context.
1 comments

Don't you resend after every turn, so splitting it avoids the n^2 token usage (granted it's cached so there's some optimal amount here)
Yes, exactly. You resend it on every turn (assuming no cache hits). This is why using the shorter-lived subagent to take in that context and only return the useful result back to the longer-lived context safes tokens.