Hacker News
by ipaddr 586 days ago
What model are you running? I found it to be too slow.
1 comment

SmolLM v2 1.7GB Q8 has been running pretty fast. It only needs about 3.5GB of memory. I think it's fine for chatting, but for code it's only really good at Python, and it falls short of larger models there.
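
For a rough sense of where a memory figure like that comes from, here's a back-of-the-envelope sketch. It assumes the model has about 1.7 billion parameters and that Q8 works out to roughly 8 bits per weight (real quantized formats store a bit more for scales). The function name is made up for illustration. It only counts the weights; context cache and runtime overhead come on top, which is presumably where the rest of the ~3.5GB goes.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: roughly 1.7 billion parameters.
N_PARAMS = 1.7e9

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # unquantized half precision
q8_gb = weight_memory_gb(N_PARAMS, 8)     # ~8 bits per weight
q4_gb = weight_memory_gb(N_PARAMS, 4)     # ~4 bits per weight

for name, gb in [("fp16", fp16_gb), ("q8", q8_gb), ("q4", q4_gb)]:
    print(f"{name}: ~{gb:.2f} GB for weights only")
```

So halving the bits per weight roughly halves the weight footprint, which is why the quantized versions fit on smaller machines.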

Deepseek-coder is pretty slow, even with the quantized models.

https://huggingface.co/blog/smollm