Show HN: Bridge your Claude/OpenAI subs into a team API with per-key cost caps



1 points by shreyas8 115 days ago

Hey HN, I built this because I wanted to give my team access to Claude and GPT models for internal testing, but the official APIs have no per-key spending controls. You can't cap a key at $5/day or 100 requests/month; it's all or nothing. With non-technical team members in the mix (designers, PMs, QA), we were always one forgotten loop or oversized prompt away from an ugly bill, and that wasn't a risk I wanted to manage manually. The idea was to let team members test with these restricted API keys before moving to official keys.

So I built a bridge: it wraps the Claude Code CLI and Codex CLI behind an Express API, backed by existing Max/Pro subscriptions instead of per-token billing. Each team member gets their own API key with hard limits — requests/day, tokens/month, cost caps. Hit the limit and the key stops working. No surprises. An admin dashboard shows who's using what in real time.
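A minimal sketch of how a per-key hard limit check like this could look. The `KeyLimits` and `KeyUsage` shapes and the `checkLimits` function are hypothetical names for illustration, not taken from the repo:

```typescript
// Hypothetical shapes; the real project's field names may differ.
interface KeyLimits {
  requestsPerDay: number;
  tokensPerMonth: number;
  costCapUsd: number;
}

interface KeyUsage {
  requestsToday: number;
  tokensThisMonth: number;
  costUsd: number;
}

type Decision = { allowed: true } | { allowed: false; reason: string };

// Reject the request as soon as any counter has reached its limit.
function checkLimits(limits: KeyLimits, usage: KeyUsage): Decision {
  if (usage.requestsToday >= limits.requestsPerDay) {
    return { allowed: false, reason: "daily request limit reached" };
  }
  if (usage.tokensThisMonth >= limits.tokensPerMonth) {
    return { allowed: false, reason: "monthly token limit reached" };
  }
  if (usage.costUsd >= limits.costCapUsd) {
    return { allowed: false, reason: "cost cap reached" };
  }
  return { allowed: true };
}
```

In an Express app, a check like this would typically run in middleware before the CLI is invoked, so a key that is over its limit never spawns a process.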

Key features:
- Two providers: /generate (Claude) and /generate-codex (Codex)
- Per-user API keys with SHA-256 hashing (shown once, never stored raw)
- Per-key hard limits with real-time tracking and enforcement
- Admin dashboard for key management, usage monitoring, and request logs
- Deploy on a $5 VPS behind Cloudflare Tunnel
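The "shown once, never stored raw" pattern can be sketched roughly as follows, using Node's built-in `crypto` module. The `sk_` prefix, the key length, and the function names are assumptions for illustration, not the project's actual format:

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hash a raw API key so only the digest needs to be persisted.
function hashKey(rawKey: string): string {
  return createHash("sha256").update(rawKey).digest("hex");
}

// Generate a new key. The caller shows rawKey to the user once
// and stores only keyHash. The "sk_" prefix is a made-up convention.
function issueKey(): { rawKey: string; keyHash: string } {
  const rawKey = "sk_" + randomBytes(32).toString("hex");
  return { rawKey, keyHash: hashKey(rawKey) };
}
```

Because the keys are long random strings rather than human-chosen passwords, a fast hash such as SHA-256 is a common choice here; a slow password hash is not needed to resist guessing.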

What it's NOT: A production API replacement. It's for internal tooling and prototyping. CLI invocations add ~3-8s latency vs direct API calls.

Important: Wrapping CLI subscriptions behind a shared API may violate the Terms of Service of the underlying providers. Anthropic's Consumer ToS (updated Feb 2026) prohibits using subscription OAuth tokens in third-party tools, and OpenAI's ToS prohibits account sharing. Review the applicable terms before using this. See the Disclaimer section in the README for details.

Security was a focus: execFile (no shell injection), timing-safe auth, CSP/HSTS, input validation, rate limiting. Details in SECURITY.md.
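Two of these measures can be sketched as follows. This is an illustration under assumptions, not the code in the repo: `keyMatches` and `runCli` are hypothetical helpers, and the binary name and arguments are left to the caller rather than guessed.

```typescript
import { execFile } from "node:child_process";
import { createHash, timingSafeEqual } from "node:crypto";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Timing-safe auth: compare the SHA-256 digest of the presented key
// against the stored hex digest without an early-exit comparison.
function keyMatches(presentedKey: string, storedHashHex: string): boolean {
  const presented = createHash("sha256").update(presentedKey).digest();
  const stored = Buffer.from(storedHashHex, "hex");
  // timingSafeEqual throws on length mismatch, so guard first.
  if (stored.length !== presented.length) return false;
  return timingSafeEqual(presented, stored);
}

// execFile passes args as an array directly to the process, with no
// shell in between, so user text is never parsed as shell syntax.
async function runCli(binary: string, args: string[]): Promise<string> {
  const { stdout } = await execFileAsync(binary, args, {
    timeout: 60_000, // assumed ceiling; kill runaway invocations
    maxBuffer: 10 * 1024 * 1024, // assumed output cap
  });
  return stdout;
}
```

Avoiding the shell removes shell metacharacter injection, but the prompt is still passed to the CLI as an argument, so input validation on length and content remains useful.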

Stack: Node.js, TypeScript, Express. No database — JSON files on disk.
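For JSON-on-disk storage, one common way to avoid half-written files is to write to a temporary file and rename it over the target. A rough sketch, assuming a hypothetical `UsageStore` shape and file name (the repo's actual layout may differ):

```typescript
import { existsSync, mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical store: usage counters keyed by API key hash.
type UsageStore = Record<string, { requestsToday: number }>;

function loadStore(path: string): UsageStore {
  if (!existsSync(path)) return {};
  return JSON.parse(readFileSync(path, "utf8")) as UsageStore;
}

function saveStore(path: string, store: UsageStore): void {
  const tmp = `${path}.tmp`;
  writeFileSync(tmp, JSON.stringify(store, null, 2), "utf8");
  // On POSIX, rename replaces the target in one step when both
  // paths are on the same filesystem.
  renameSync(tmp, path);
}

// Usage: round-trip a store through a scratch directory.
const demoDir = mkdtempSync(join(tmpdir(), "bridge-demo-"));
const demoPath = join(demoDir, "usage.json");
saveStore(demoPath, { "key-hash-1": { requestsToday: 3 } });
const reloaded = loadStore(demoPath);
```

This works within a single Node process; multiple processes writing the same file would need additional coordination.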

GitHub: https://github.com/Shreyas-Dayal/ai-cli-bridge

Would love feedback on the approach and any security concerns I might have missed.

2 comments

AlexCalderAI 109 days ago

Nice approach to the per-key cost cap problem. We built something similar for tracking AI spend across providers - the "one forgotten loop" scenario is real and expensive.

The JSON-on-disk pattern works surprisingly well for this scale. We found the key insight is making costs visible in real-time rather than waiting for end-of-month bills. Even just seeing token counts per request changes behavior.

Curious if you've hit the CLI latency wall yet with concurrent users - that 3-8s overhead compounds fast with a team.


shreyas8 106 days ago

Spot on, real-time cost visibility changes behavior more than any hard limit does. On CLI latency with concurrent users: each request spawns its own process, so they don't block each other, but yeah, 3-8s per call isn't great at scale. Works fine for a small team doing internal demos/testing, which is all this is meant for. It becomes very noticeable for chat-type functionality, though; that kind of latency ruins the conversational flow. Anything needing real-time responses or throughput should just use the official APIs directly.


othersidejann 115 days ago

Isn't this pretty clearly against Anthropic ToS?


shreyas8 114 days ago

Yup, it likely is. I call this out in the README disclaimer. Built it before Anthropic's Feb 2026 policy update and discovered the ToS issue while prepping to open-source.

Published it anyway since the per-key cost controls and usage tracking work just as well in front of official API keys. The CLI-subscription bit was the original motivation but not the only way to run it.
