Show HN: I built a proxy that cuts LLM costs 40-60% – no AI involved

How it works under the hood (since HN will ask): No LLM call, no summarization — purely deterministic.

Strips filler words ("basically", "essentially"), collapses verbose constructions ("in order to" → "to"), removes redundant connectors. Output is always a strict subset of the original — no words added, none moved.
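
To make the "strict subset" property concrete, here is a minimal sketch of what a deterministic rule pass could look like. It is not the actual rule engine: the `FILLERS` and `PHRASES` tables, the `compress` function, and the `is_subsequence` check are all hypothetical names chosen for illustration, and the tokenizer is a plain whitespace split.

```python
import string

# Hypothetical rule tables; the real rule set is not described in this post.
FILLERS = {"basically", "essentially"}
# Each entry: (phrase words, indices of the words to keep).
PHRASES = [(("in", "order", "to"), {2})]


def compress(text: str) -> str:
    """Drop tokens by rule; never add or reorder anything."""
    tokens = text.split()
    # Normalized keys are used for matching only; output uses the original tokens.
    keys = [t.strip(string.punctuation).lower() for t in tokens]
    keep = [k not in FILLERS for k in keys]
    for phrase, kept in PHRASES:
        n = len(phrase)
        for i in range(len(keys) - n + 1):
            if tuple(keys[i:i + n]) == phrase:
                for j in range(n):
                    if j not in kept:
                        keep[i + j] = False
    return " ".join(t for t, k in zip(tokens, keep) if k)


def is_subsequence(small: list, big: list) -> bool:
    """True if `small` appears in `big` in the same order (gaps allowed)."""
    it = iter(big)
    return all(tok in it for tok in small)


original = "Basically, we cache results in order to reduce latency."
shorter = compress(original)
print(shorter)
print(is_subsequence(shorter.split(), original.split()))
```

Because the sketch only ever deletes whole tokens, the subsequence check holds by construction, which is the same guarantee described above.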

On privacy, since it always comes up: your OpenAI/Claude keys never leave your app. You send us text → we return compressed text → you call your LLM yourself. We don't know which model you use or what you're building.
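
The flow above can be sketched as two separate steps that your app composes. Everything here is a stand-in so the example runs offline: `ask`, `fake_compress`, and `make_fake_llm` are hypothetical names, not the agentready API or any provider SDK.

```python
from typing import Callable


def ask(prompt: str,
        compress: Callable[[str], str],
        call_llm: Callable[[str], str]) -> str:
    """Compress first, then call the model from inside your own app."""
    shorter = compress(prompt)   # only the prompt text crosses this boundary
    return call_llm(shorter)     # the provider key is used here, locally


# Records what the compression step receives, to show it is text only.
seen_by_compressor = []


def fake_compress(text: str) -> str:
    """Stand-in for the remote compression call."""
    seen_by_compressor.append(text)
    return text.replace("basically ", "")


def make_fake_llm(api_key: str) -> Callable[[str], str]:
    """Stand-in for your own LLM client; the key stays in this closure."""
    def call(text: str) -> str:
        return f"echo: {text}"
    return call


reply = ask("basically summarize this", fake_compress, make_fake_llm("sk-placeholder"))
print(reply)
print(seen_by_compressor)
```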

Real numbers across 2.4M+ calls: 42% average reduction. One beta user at 50k prompts/day saves $2,100/month.
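
As a sanity check on the beta-user figure, here is the per-prompt saving it implies. The 30-day month is an assumption for the arithmetic; the other two numbers are the ones quoted above.

```python
prompts_per_day = 50_000
monthly_savings_usd = 2_100
days_per_month = 30  # assumption, not stated in the post

prompts_per_month = prompts_per_day * days_per_month
savings_per_prompt = monthly_savings_usd / prompts_per_month

print(f"{prompts_per_month:,} prompts/month")
print(f"${savings_per_prompt:.4f} saved per prompt")
```

That works out to roughly $0.0014 saved per prompt at that volume.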

For existing codebases, one line:

  from agentready import patch_openai

Every OpenAI call gets compressed automatically. Zero other changes.

Free during beta, no card: https://agentready.cloud/hn

AMA on how the rule engine works. The edge cases are more interesting than you'd expect.