Hacker News
by baruch, 39 days ago
It is possible to get more tokens out of the same hardware by using fast storage for the KV cache. This is especially useful for agentic workloads.
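To make the idea concrete, here is a toy sketch of a two-tier cache, not any real inference server's implementation. It assumes KV blocks are keyed by a hash of the prompt prefix's token ids, kept in a small in-memory LRU, and spilled to files on fast storage when evicted. The class name `TieredKVCache`, the `.kv` file suffix, and the use of `pickle` for the blobs are all hypothetical choices for illustration.

```python
import hashlib
import os
import pickle
from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier cache: a small in-memory LRU backed by files on fast storage."""

    def __init__(self, spill_dir, mem_capacity=2):
        self.spill_dir = spill_dir        # directory on the fast storage tier (assumed to exist)
        self.mem_capacity = mem_capacity  # max number of entries held in memory
        self.mem = OrderedDict()          # prefix key -> KV blob, oldest first

    @staticmethod
    def key_for(token_ids):
        # Identify a prompt prefix by a hash of its token ids.
        return hashlib.sha256(repr(list(token_ids)).encode()).hexdigest()

    def _path(self, key):
        return os.path.join(self.spill_dir, key + ".kv")

    def put(self, token_ids, kv_blob):
        key = self.key_for(token_ids)
        self.mem[key] = kv_blob
        self.mem.move_to_end(key)
        # Spill least recently used entries to storage instead of dropping them.
        while len(self.mem) > self.mem_capacity:
            old_key, old_blob = self.mem.popitem(last=False)
            with open(self._path(old_key), "wb") as f:
                pickle.dump(old_blob, f)

    def get(self, token_ids):
        """Return (kv_blob, tier), where tier is 'memory', 'storage', or 'miss'."""
        key = self.key_for(token_ids)
        if key in self.mem:
            self.mem.move_to_end(key)
            return self.mem[key], "memory"
        path = self._path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                blob = pickle.load(f)
            os.remove(path)
            self.put(token_ids, blob)  # promote back into the memory tier
            return blob, "storage"
        # A miss means the caller has to recompute the prefill for this prefix.
        return None, "miss"
```

In this sketch a "storage" hit stands in for the case the comment is about: an agent comes back with the same long context, and the KV state is read from disk rather than recomputed on the accelerator.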