| Fair enough. Trying to keep it concise here. This is how you install it:

    pip install "headroom-ai[proxy]"
    headroom proxy --port 8787

It will:

* Check all the data going into the LLM and apply intelligent compression based on the content type (different for JSON, code, etc.).
* Keep the compression reversible, so if the LLM is not getting what it is seeking, it will not lose accuracy.
* Eliminate the needle-in-a-haystack problems caused by MCP tools, code function calls, etc. filling up the context window.

There is also an SDK, which works like this:

    from langchain_openai import ChatOpenAI
    from headroom.integrations import HeadroomChatModel

    # Wrap your model - that's it!
    llm = HeadroomChatModel(ChatOpenAI(model="gpt-4o"))

    # Use exactly like before
    response = llm.invoke("Hello!")

I've personally used it with Claude Code and Cursor and seen the benefits. |