| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by iamjackg 152 days ago

It's not unsolved, at least not the first part of your question. In fact it is a feature offered by all main LLM providers!

- https://platform.openai.com/docs/guides/prompt-caching

- https://platform.claude.com/docs/en/build-with-claude/prompt...

- https://ai.google.dev/gemini-api/docs/caching

2 comments

imiric 152 days ago

Ah, that's good to know, thanks.

But then why is there compounding token usage in the article's trivial solution? Is it just a matter of using the cache correctly?

link

StevenWaterman 152 days ago

Cached tokens are cheaper (90% discount ish) but not free

link

moyix 152 days ago

Also, unlike OpenAI, Anthropic's prompt caching is explicit (you set up to 4 cache "breakpoints"), meaning if you don't implement caching then you don't benefit from it.

link

netcraft 152 days ago

thats a very generous way of putting it. Anthropic's prompt caching is actively hostile and very difficult to implement properly.

link

igravious 152 days ago

dumb question, but is prompt caching available to Claude Code … ?

link

stavros 151 days ago

If you're using the API, yes. If you have a subscription, you don't care, as you aren't billed per prompt (you just have a limit).

link