by zozbot234 80 days ago
KV quantization has long been available in llama.cpp
1 comment

Yes, but the optimisation described has not, right?