| HN Mirror

Fair enough. Trying to keep it concise here - This is how you install it:

pip install "headroom-ai[proxy]"

headroom proxy --port 8787

It will:

* Check all the data going into the LLM and apply intelligent compression based on the content type - different for JSONs, code etc.

* If the LLM is not getting what it is seeking, there is reversible compression - so the LLM will not lose accuracy

* When you think of MCP tools, code function calls etc. that fill up the context window and cause needle in haystack problems - they get eliminated.

There is also an SDK which works like this:

from langchain_openai import ChatOpenAI from headroom.integrations import HeadroomChatModel

# Wrap your model - that's it!

llm = HeadroomChatModel(ChatOpenAI(model="gpt-4o"))

# Use exactly like before response = llm.invoke("Hello!")

Ive personally used it with Claude Code and Cursor and seen the benefits.