|
|
|
Ask HN: How are you developing and testing agents without burning tokens?
|
|
1 points
by zlatkov
158 days ago
|
|
Every agent run (and especially when running tests in CI/CD) adds token cost, nondeterminism, and slows iteration. I’m not talking about LLM evals here. I mean integration-style testing. There are some workarounds like VCR-style record/replay (e.g., Python's VCR “cassettes”) and similar ideas for capturing LLM calls. They can work for certain frameworks when your test/runtime is the one making the HTTP requests. Using local LLMs to save costs adds significant delay for each execution (+ doesn't solve nondeterminism). However, it becomes significantly more challenging for MCP-style flows, where the interaction can be IDE-driven (stdio/JSON-RPC) and the HTTP calls may not even originate from your code. I'm curious what people are using. |
|