Hacker News
by csomar 340 days ago
My understanding is that caching reduces computation, but the whole input is still processed. I don’t think they are fully disclosing how their cache works.
LLMs degrade with long input regardless of caching.
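To make the caching point concrete, here is a toy sketch of one common form of caching (key/value prefix caching in a single attention layer). It is an assumption that this is roughly the kind of cache being discussed; `project`, `PrefixCache`, `run_cached`, and `run_uncached` are hypothetical names made up for illustration, not any provider's API. The idea it shows: the cache skips recomputing keys and values for the prefix, but the new token still attends over every cached position, so the full input still shapes the output.

```python
import math


def project(token):
    """Toy stand-in for learned key/value projections (hypothetical)."""
    key = [math.sin(token + i) for i in range(4)]
    value = [math.cos(token * (i + 1)) for i in range(4)]
    return key, value


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights


def run_uncached(tokens):
    """Recompute keys and values for every token, then attend."""
    pairs = [project(t) for t in tokens]
    keys = [k for k, _ in pairs]
    values = [v for _, v in pairs]
    query = project(tokens[-1])[0]  # toy: reuse the key projection as the query
    return attend(query, keys, values)


class PrefixCache:
    """Stores keys/values already computed for earlier tokens."""

    def __init__(self):
        self.keys = []
        self.values = []
        self.projections = 0  # counts how many tokens were actually projected

    def extend(self, tokens):
        for t in tokens:
            key, value = project(t)
            self.keys.append(key)
            self.values.append(value)
            self.projections += 1


def run_cached(cache, new_tokens):
    """Project only the new tokens, but attend over cached plus new positions."""
    cache.extend(new_tokens)
    query = project(new_tokens[-1])[0]
    return attend(query, cache.keys, cache.values)


prefix = list(range(1, 9))  # 8 tokens that are already in the cache
suffix = [9, 10]            # 2 new tokens

cache = PrefixCache()
cache.extend(prefix)
warm = cache.projections

cached_out, cached_weights = run_cached(cache, suffix)
full_out, full_weights = run_uncached(prefix + suffix)

print("tokens projected on the cached call:", cache.projections - warm)
print("positions attended to:", len(cached_weights))
```

In this toy, the cached call projects only the two new tokens, yet the attention weights still span all ten positions and the result matches the uncached run. That fits the claim that caching saves computation without shrinking the effective input, so any long-context degradation would be unaffected by it.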