by ggerganov 76 days ago
Better keep the KV cache in full precision
1 comment

Wow, the GOAT himself. Thank you so much for creating llama.cpp! Will re-deploy with the full-precision KV cache once requests stop coming in.