Hacker News
by vlovich123 450 days ago
Ah OK. So this is for resuming chat context cheaply. What I said is still correct - 3FS is not part of the inference flow & not relevant to the paper, which is about optimizing KV cache usage at runtime.