by wkat4242 319 days ago
Yes, but you can quantise the KV cache too, just like you can the weights.
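
To make that concrete, here is a rough sketch of the idea in plain Python: symmetric int8 quantisation applied to a single cached key vector, the same basic trick commonly used for weights. The function names and the sample vector are made up for illustration, and real inference engines typically use block-wise formats and fused kernels rather than anything this simple.

```python
def quantise_int8(values):
    """Symmetric per-vector int8 quantisation: returns (integer codes, scale)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # One float scale per vector; fall back to 1.0 for an all-zero vector.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantise_int8(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical cached key vector for one token and one attention head.
key_vector = [0.12, -1.5, 0.73, 2.0, -0.04]

codes, scale = quantise_int8(key_vector)
restored = dequantise_int8(codes, scale)

# Worst-case rounding error is about half a quantisation step (scale / 2).
max_error = max(abs(a - b) for a, b in zip(key_vector, restored))
print(codes, scale, max_error)
```

Each element then takes one byte instead of the two an fp16 cache would use, plus one scale per vector, at the cost of a small rounding error per element.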