Hacker News
by ryao 386 days ago
I had been unaware of the others. Anyway, you need to write to the KV cache for every token generated. You are going to hit that fast.
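
To put a rough number on the per-token writes, here is a back-of-the-envelope sketch. It assumes a standard decoder-only transformer that stores one key vector and one value vector per layer per KV head for each new token. The model shape (32 layers, 8 KV heads, head dimension 128, 2-byte elements) is a hypothetical example, not any specific model from this thread.

```python
def kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem):
    """Bytes appended to the KV cache for one generated token.

    Assumes one key and one value vector per layer per KV head
    (hence the factor of 2).
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def total_kv_bytes_written(n_tokens, per_token_bytes):
    """Cumulative KV-cache bytes written after generating n_tokens."""
    return n_tokens * per_token_bytes


# Hypothetical model shape, chosen only for illustration.
per_token = kv_bytes_per_token(
    n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_elem=2
)
print(per_token)  # 131072 bytes, i.e. 128 KiB per token

# Cumulative writes for one million generated tokens.
print(total_kv_bytes_written(1_000_000, per_token))  # 131072000000 bytes
```

With those assumed dimensions, every generated token appends 128 KiB, so the write volume scales linearly with the number of tokens generated.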