Hacker News
by YetAnotherNick, 843 days ago
My calculation of the KV cache size gives about 1 GB per 3000 tokens in fp16. I am surprised OpenAI's competitors haven't done this. This kind of feature has uses that are not so niche, wherever prefix data can be cached.
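For anyone who wants to redo the estimate, here is a rough sketch of the usual back-of-the-envelope formula: one key vector and one value vector per layer per token. The comment does not say which model the 1 GB figure is based on, so the dimensions below (32 layers, 32 KV heads, head size 128) are purely an assumption for illustration, and `kv_cache_bytes` is a hypothetical helper, not part of any library.

```python
def kv_cache_bytes(n_tokens, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Approximate KV cache size in bytes.

    Each token stores one key and one value vector (hence the factor 2)
    per layer, each of size n_kv_heads * head_dim.
    bytes_per_elem=2 corresponds to fp16.
    """
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * n_tokens


# Assumed dimensions (NOT from the comment): a 7B-class model
# with full multi-head attention.
size = kv_cache_bytes(n_tokens=3000, n_layers=32, n_kv_heads=32, head_dim=128)
print(f"{size / 1e9:.2f} GB for 3000 tokens")
```

With these assumed dimensions the cache comes to roughly 0.5 MB per token, the same order of magnitude as the figure above. A model that shares KV heads across query heads (grouped-query attention) would need proportionally less.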