Hacker News
by
tibbar
65 days ago
Doesn't context caching mostly eliminate this problem? (I suppose that with enough context, even the 90%-discounted price eventually adds up to a lot anyway.)
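To put rough numbers on the parenthetical, here is a minimal sketch of the arithmetic. The price of $3.00 per million input tokens is a made-up figure for illustration, and `input_cost` is a hypothetical helper, not any provider's API; only the 90% discount comes from the comment above.

```python
def input_cost(context_tokens, price_per_mtok, cached_discount=0.90, cached=True):
    """Dollar cost of sending `context_tokens` input tokens in one request."""
    # Cached tokens are billed at (1 - discount) of the normal input rate.
    rate = price_per_mtok * (1.0 - cached_discount) if cached else price_per_mtok
    return context_tokens / 1_000_000 * rate


PRICE = 3.00  # hypothetical $ per million input tokens, for illustration only

for n in (10_000, 100_000, 1_000_000):
    full = input_cost(n, PRICE, cached=False)
    discounted = input_cost(n, PRICE, cached=True)
    print(f"{n:>9} tokens: uncached ${full:.3f}, cached ${discounted:.3f} per request")
```

Under these assumed numbers, a fully cached 1M-token context costs about as much per request as an uncached 100k-token one, so the discount shifts the curve rather than flattening it.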
1 comment
zozbot234
65 days ago
Context caching really just stores the KV cache for reuse. It saves rerunning prefill for that part of the context, but every new token still has to attend over those cached entries, so tokens that reference a long cached context will still cost more.
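A toy model of that point, assuming plain causal attention and counting only query-key pairs per layer and head (it ignores MLP cost, memory bandwidth, batching, and anything provider-specific). `attention_reads` is a hypothetical helper written for this illustration.

```python
def attention_reads(context_len, new_tokens, prefix_cached):
    """Count query-key pairs scored by causal attention (one layer, one head)."""
    # Prefill: the token at position i attends to i + 1 positions.
    # With a cached prefix this work is skipped entirely.
    prefill = 0 if prefix_cached else context_len * (context_len + 1) // 2
    # Decode: the k-th generated token (0-indexed) attends to the whole
    # context plus the k tokens generated before it plus itself.
    decode = sum(context_len + k + 1 for k in range(new_tokens))
    return prefill, decode


for ctx in (1_000, 100_000):
    prefill, decode = attention_reads(ctx, new_tokens=100, prefix_cached=True)
    print(f"context={ctx:>7}: prefill reads={prefill}, decode reads={decode}")
```

In this sketch, caching drives the prefill term to zero, but the decode term still grows linearly with the cached context length: 100 new tokens against a 100k-token cache score roughly 95 times as many query-key pairs as against a 1k-token cache.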