Tell HN: A company was billed $128K from one leaked GCP API key


8 points by daudmalik06 134 days ago

I've been collecting real incidents over the past few months. The pattern is always identical:

- $128,000 — small company in Japan, caught it at $44K, shut everything down, charges kept accumulating, Google denied the adjustment request

- $82,314 — 3-person startup, Gemini API key silently reused, normal monthly spend was $180

- $55,444 — student, key leaked on GitHub during summer break, discovered months later

- ~$75,000 — student in India building a blood cancer detection tool, received legal threats from Google

The root cause in every case: GCP billing data lags 4-12 hours. By the time a budget alert fires, the damage is already done. There is no native automatic kill switch.
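To make the lag concrete, here is a back-of-the-envelope sketch. The burn rate is a hypothetical figure chosen for illustration, not taken from any of the incidents above; only the 4-12 hour lag comes from the post.

```python
def spend_before_alert(burn_rate_per_hour: float, lag_hours: float) -> float:
    """Spend accrued before lagging billing data can reflect the abuse."""
    return burn_rate_per_hour * lag_hours


# Hypothetical burn rate (USD per hour) for an abused key; an assumption.
RATE = 2_000.0

for lag in (4, 12):
    exposure = spend_before_alert(RATE, lag)
    print(f"billing lag {lag}h: ${exposure:,.0f} spent before an alert can fire")
```

At any burn rate, exposure scales linearly with the lag, so a signal that arrives in minutes rather than hours shrinks the worst case by roughly two orders of magnitude.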

The only reliable real-time signal is `serviceruntime.googleapis.com/api/request_count` via Cloud Monitoring, which has a 3-5 minute ingestion delay. Budget alerts don't use this — they use billing data.
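A minimal sketch of what a request-count trigger could look like, assuming you have already pulled per-minute values of that metric from Cloud Monitoring (the fetch is left out here). The function name, the baseline, the multiplier, and the "sustained for N minutes" rule are all illustrative assumptions, not how any particular tool works.

```python
from typing import Sequence


def should_revoke(
    counts_per_min: Sequence[int],
    baseline_per_min: float,
    multiplier: float = 10.0,
    sustained: int = 3,
) -> bool:
    """Return True when the last `sustained` samples all exceed
    baseline * multiplier, i.e. the spike is not a one-minute blip."""
    if len(counts_per_min) < sustained:
        return False  # not enough data to judge
    threshold = baseline_per_min * multiplier
    return all(c > threshold for c in counts_per_min[-sustained:])


# Hypothetical series: normal traffic, then a sustained spike.
series = [5, 6, 4, 900, 1200, 1500]
if should_revoke(series, baseline_per_min=5.0):
    print("spike sustained: disable or rotate the key")
```

Requiring several consecutive samples trades a few extra minutes of exposure for fewer false positives; with a 3-5 minute ingestion delay, that still reacts far sooner than billing data.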

Has anyone else dealt with this? Curious how teams are protecting themselves today.

2 comments

JasperNoboxdev 132 days ago

We've tried a bunch of approaches, and it always comes down to the same few things:

- Built internal tooling to keep keys out of AI chats and anywhere they could leak. The moment a raw key enters a conversation or a shared space, you've lost control of it.

- LLM gateways with capped virtual keys per developer and separate service accounts. If a key leaks, it's easy to kill, doesn't affect the product, and the damage is capped — not your whole billing account.

- A scoped intermediary layer for any autonomous agents. Anything running without a human in the loop gets its own access that we can kill in seconds.
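The capped virtual key idea from the list above can be sketched roughly as follows. This is an in-memory toy under stated assumptions: `Gateway`, `VirtualKey`, and the per-request cost estimate are hypothetical names, and a real gateway would persist state and meter actual usage.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class VirtualKey:
    owner: str
    cap_usd: float
    spent_usd: float = 0.0
    revoked: bool = False


class Gateway:
    """Hands out virtual keys; the real provider key never leaves the gateway."""

    def __init__(self) -> None:
        self.keys: Dict[str, VirtualKey] = {}

    def issue(self, key_id: str, owner: str, cap_usd: float) -> None:
        self.keys[key_id] = VirtualKey(owner=owner, cap_usd=cap_usd)

    def revoke(self, key_id: str) -> None:
        # Killing one virtual key does not affect any other key.
        if key_id in self.keys:
            self.keys[key_id].revoked = True

    def authorize(self, key_id: str, est_cost_usd: float) -> bool:
        """Allow the request only if the key is live and stays under its cap."""
        key = self.keys.get(key_id)
        if key is None or key.revoked:
            return False
        if key.spent_usd + est_cost_usd > key.cap_usd:
            return False
        key.spent_usd += est_cost_usd
        return True


gw = Gateway()
gw.issue("dev-alice", owner="alice", cap_usd=10.0)
print(gw.authorize("dev-alice", 6.0))  # within cap
print(gw.authorize("dev-alice", 6.0))  # would exceed cap, so refused
```

The point of the design is that a leaked virtual key can burn at most its own cap, and revoking it is a single call that leaves the product's own credentials untouched.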

We ended up building some custom tooling here specifically for working with AI agents. There's always this tension between the easy way (just paste it into the chat, it'll be fine) and the proper way, which usually ends up being too cumbersome for anyone to actually follow.


daudmalik06 128 days ago

The approaches make sense for teams with the engineering resources to build internal tooling. The LLM gateway layer is smart: virtual keys with caps are exactly the right mental model. The hard part is that most solo devs and small teams never get around to building that layer, which is where the incidents happen. We built CloudSentinel specifically for that gap: automatic revocation based on raw request count, no internal tooling required. Happy to share more if useful.


steebchen 132 days ago

LLMGateway helps with this: you can set spend caps on API keys and monitor usage in real time.
