Hacker News
by zozbot234 65 days ago
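Context caching really just means storing the KV-cache for reuse. It saves re-running prefill for that part of the context, but any new token that attends to the cached KV still costs more to process than it would with a short context, because its attention work grows with the length of what it attends to.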
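
To put rough numbers on that, here is a toy cost model. It is only a sketch under simplifying assumptions: it counts query-key dot products for causal attention in a single head and layer, and ignores MLP and projection costs, memory bandwidth, and whatever a provider actually charges. The function name `attention_pairs` and the 1000/10 token split are made up for illustration.

```python
def attention_pairs(total_len: int, cached_len: int = 0) -> int:
    """Count query-key dot products for causal attention (one head, one layer).

    The token at position i (1-based) attends to i keys. Positions
    1..cached_len are assumed to already have their K/V stored, so only
    positions cached_len+1..total_len run as queries.
    """
    if not 0 <= cached_len <= total_len:
        raise ValueError("cached_len must be between 0 and total_len")
    return sum(range(cached_len + 1, total_len + 1))


prefix, suffix = 1000, 10  # hypothetical sizes: long shared context, short new input

no_cache = attention_pairs(prefix + suffix)            # prefill everything
with_cache = attention_pairs(prefix + suffix, prefix)  # reuse the prefix KV-cache
short_ctx = attention_pairs(suffix)                    # same 10 tokens, no prefix

print(no_cache, with_cache, short_ctx)
```

Under this model the cache removes the prefix's own prefill work, but the 10 new tokens still do far more attention work than the same 10 tokens would with no prefix, since each one attends over the whole cached context.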