by jhonof
405 days ago
Not OP, but for autocomplete I'm running Qwen2.5-Coder-7B, which I quantized to Q2_K. I followed this guide: https://blog.steelph0enix.dev/posts/llama-cpp-guide/#quantiz... With that I get autocomplete results fast enough to be useful. I have an NVIDIA RTX 4060 in a laptop with 8 GB of dedicated memory that I use for it. I still use Claude for chat (pair programming), though, and I don't really use agents.
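
For a rough sense of why a low-bit quant fits comfortably in 8 GB, here is a back-of-the-envelope sketch in Python. The parameter count and the bits-per-weight figures are assumptions for illustration, not measured values, and the estimate covers weights only (no KV cache or runtime overhead).

```python
def quantized_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: rough parameter count for a "7B"-class model.
N_PARAMS = 7.6e9

# Assumption: approximate effective bits per weight for each format.
# Real values vary by model and llama.cpp version.
BITS_PER_WEIGHT = {"Q2_K": 3.0, "Q4_K_M": 4.8, "F16": 16.0}

VRAM_GIB = 8.0  # dedicated GPU memory mentioned above

for name, bpw in BITS_PER_WEIGHT.items():
    size = quantized_weight_gib(N_PARAMS, bpw)
    verdict = "fits" if size < VRAM_GIB else "does not fit"
    print(f"{name}: ~{size:.1f} GiB of weights, {verdict} in {VRAM_GIB:.0f} GiB")
```

Whatever memory the weights leave free is what remains for the context (KV cache), so a smaller quant trades some output quality for more headroom and speed.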