Show HN: Agent Caching in Fiddler (telerik.com)
2 points by zlatkov 97 days ago
Hi HN!

We’re the team behind Fiddler, and we recently built an Agent Cache feature.

It works by capturing an LLM request/response pair and, once caching is enabled, serving subsequent matching calls locally so duplicate requests never hit the provider.
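For readers who want to picture the mechanism, here is a minimal sketch of exact-match replay in Python. It is not Fiddler's implementation: `ExactReplayCache`, `request_key`, and `fake_provider` are hypothetical names, and hashing the canonical JSON body is only one assumed way to decide that two calls "match".

```python
import hashlib
import json
from typing import Callable, Dict


def request_key(request: dict) -> str:
    """Hash the canonical JSON form so byte-identical payloads share one key."""
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ExactReplayCache:
    """Hypothetical exact-match cache (an illustration, not Fiddler's code)."""

    def __init__(self, call_provider: Callable[[dict], dict]) -> None:
        self._call_provider = call_provider  # function that really calls the LLM
        self._store: Dict[str, dict] = {}    # captured request key -> response
        self.enabled = True                  # when False, every call goes upstream
        self.provider_calls = 0              # how many calls reached the provider

    def send(self, request: dict) -> dict:
        key = request_key(request)
        # Serve a captured response only when caching is on and the key matches.
        if self.enabled and key in self._store:
            return self._store[key]
        # Otherwise forward to the provider and capture the pair for later replay.
        self.provider_calls += 1
        response = self._call_provider(request)
        self._store[key] = response
        return response


def fake_provider(request: dict) -> dict:
    """Stand-in for a real LLM endpoint, used only for this demo."""
    return {"text": "echo: " + request["prompt"]}


if __name__ == "__main__":
    cache = ExactReplayCache(fake_provider)
    cache.send({"model": "demo", "prompt": "plan the next step"})
    cache.send({"model": "demo", "prompt": "plan the next step"})  # replayed locally
    print(cache.provider_calls)
```

Because the key is an exact hash, any change to the prompt, model, or parameters is a miss and goes to the provider again.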

It’s meant to reduce three pains that show up together in agent development: token cost, feedback-loop latency, and output non-determinism.

We’d love to hear your thoughts.

1 comment

Cool idea. I have had rather a bad experience with semantic caching. Do you have benchmarks that demonstrate the effectiveness?

This is dev‑time exact replay, not semantic caching. In early development, a lot of iteration seems to be about validating the flow rather than the quality of the model’s response.

Semantic caching feels more relevant later on, when reuse across similar inputs starts to matter. In a dev-time context, an exact cache is often good enough, and that's the problem we set out to solve with Agent Cache.

I’m curious what your experience has been with repeated LLM calls during dev.