Hacker News
by adrmtu 407 days ago
Cool project! How does the prefix cache work exactly? What’s your invalidation strategy when the page’s structure drifts (and how often do you refresh)? And how do you match an incoming question or task to the correct cached prefix? What criteria or fingerprinting logic do you use to ensure high hit rates without false positives?
1 comment

Thank you! It's currently based on task lineage, exact matching of task descriptions, and an optional user-provided cache_control argument that controls whether results or plans are cached.

One use case for this is conversations. For example, if I invoke /chat/completions with [{"role": "user", "content": "Go to google.com"}] and later with [{"role": "user", "content": "Go to google.com"}, {"role": "user", "content": "Search for gorilla vs 100 human"}], then we cache the browser state from the first invocation so it can be quickly restored (or we reuse the browser if it hasn't been evicted).
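
To make the conversation case concrete, here's a rough Python sketch of exact-match prefix lookup. It's an illustration only, not our actual implementation: PrefixCache, prefix_key, and the dict standing in for browser state are hypothetical names, and task lineage, eviction, and cache_control are all left out.

```python
import hashlib
import json


def prefix_key(messages):
    """Stable hash of a message list. Exact match only, no fuzzy matching."""
    canonical = json.dumps(messages, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class PrefixCache:
    """Hypothetical cache mapping a conversation prefix to a saved browser state."""

    def __init__(self):
        self._states = {}

    def store(self, messages, browser_state):
        self._states[prefix_key(messages)] = browser_state

    def longest_prefix(self, messages):
        """Return (n, state) for the longest cached prefix messages[:n], else (0, None)."""
        for n in range(len(messages), 0, -1):
            key = prefix_key(messages[:n])
            if key in self._states:
                return n, self._states[key]
        return 0, None


cache = PrefixCache()

# First invocation: run the task, then save the resulting state (placeholder dict).
first = [{"role": "user", "content": "Go to google.com"}]
cache.store(first, {"url": "https://google.com"})

# Second invocation: same first message plus one new message.
second = first + [{"role": "user", "content": "Search for gorilla vs 100 human"}]
matched, state = cache.longest_prefix(second)
remaining = second[matched:]  # only these messages still need to be executed
```

Because the key is a hash of the exact message list, any change to an earlier message is a cache miss rather than a false positive, at the cost of missing near-duplicates.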

Caching will get much more sophisticated in a future version; it's the piece we're most actively working on.