Hacker News
Show HN: Costile – open-source proxy, blocks AI API requests when budget is hit (costile.com)
2 points by Mkiza 63 days ago
I got a surprise bill. Nothing catastrophic, but enough to make me dig into why — an agent had hit a retry loop and kept calling the API for hours. There's no way to set a hard cap on the Anthropic or OpenAI APIs. You can get an email after the fact, but nothing that actually stops requests mid-flight.

So I built a proxy. You swap one environment variable, it routes through Costile instead of calling Anthropic directly, and when you hit your daily or monthly limit it blocks further requests immediately. No SDK changes, no code refactor. Took me about a weekend. Currently supports Anthropic, with OpenAI next.
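To make the blocking behaviour described above concrete, here is a minimal sketch of a daily/monthly hard cap check. It is not Costile's actual code: `BudgetGate`, `handle_request`, the in-memory storage, and the 429 status are all hypothetical choices for illustration, and the real proxy may track spend and reject requests differently.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class BudgetGate:
    """Hypothetical in-memory spend tracker with daily and monthly hard caps."""
    daily_limit_usd: float
    monthly_limit_usd: float
    _daily: dict = field(default_factory=dict)    # date -> spend in USD
    _monthly: dict = field(default_factory=dict)  # (year, month) -> spend in USD

    def record(self, cost_usd: float, day: date) -> None:
        """Add the cost of a completed upstream call to both counters."""
        self._daily[day] = self._daily.get(day, 0.0) + cost_usd
        key = (day.year, day.month)
        self._monthly[key] = self._monthly.get(key, 0.0) + cost_usd

    def allows(self, day: date) -> bool:
        """Return True only while both the daily and monthly caps are unspent."""
        spent_today = self._daily.get(day, 0.0)
        spent_month = self._monthly.get((day.year, day.month), 0.0)
        return (spent_today < self.daily_limit_usd
                and spent_month < self.monthly_limit_usd)


def handle_request(gate: BudgetGate, day: date, forward):
    """Check the budget before forwarding; `forward` is a stand-in callable
    that performs the upstream call and returns its cost in USD."""
    if not gate.allows(day):
        # Assumption: reject with HTTP 429; the real proxy may use another status.
        return 429, "budget exceeded"
    gate.record(forward(), day)
    return 200, "ok"
```

Because the check runs before the upstream call, the request that crosses the limit still goes through and the next one is blocked. A stricter variant would estimate the cost up front and refuse any call that would exceed the cap.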

It's MIT licensed and self-hostable in about 5 minutes. Try the demo at costile.com if you want to poke at it.

I've got anomaly detection on the roadmap, but I'm second-guessing the scope — is surfacing cost spikes enough, or do people actually need to know why the agent went off the rails? The former is straightforward to build, the latter is a much harder problem. Curious where others would draw that line.
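For a sense of how small the "surface cost spikes" version could be, here is a rough sketch that flags any hour whose spend is well above the trailing average. The function name, the hourly bucketing, and the `window` and `factor` defaults are assumptions picked for illustration, not anything from the Costile roadmap.

```python
from statistics import mean


def find_cost_spikes(hourly_costs, window=3, factor=3.0):
    """Return indices of hours whose cost exceeds `factor` times the mean
    of the preceding `window` hours. Hypothetical heuristic, not tuned."""
    spikes = []
    for i in range(window, len(hourly_costs)):
        baseline = mean(hourly_costs[i - window:i])
        # Skip hours with a zero baseline to avoid flagging the first activity.
        if baseline > 0 and hourly_costs[i] > factor * baseline:
            spikes.append(i)
    return spikes
```

This only answers "when did spend jump". Explaining why the agent looped would need request-level context (prompts, tool calls, retries), which is the harder problem.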

GitHub: https://github.com/Mkiza/ai-agent-cost

2 comments

Hey Mkiza, great initiative indeed. I think Edgee.ai covers your cost control issue, and they also offer token compression. Have a look, it might solve your problem. LMK :-)
You can do this with OpenRouter though, can't you? They have a markup which is annoying, but they also have a long list of LLMs
True, OpenRouter covers the basics, but to my understanding you're paying a markup on every token (correct me if I'm wrong), and for high-volume setups that gets painful fast. And if you're calling Anthropic directly (common for compliance or reliability reasons), OpenRouter isn't really an option anyway. That said, if you're already on OpenRouter and the markup doesn't bother you, it probably does the job.