SmolLM2 1.7B at Q8 has been running pretty fast for me. It only needs about 3.5GB of memory. I think it's fine for chatting, but for code it's only really good at Python compared to larger models.

DeepSeek-Coder is pretty slow, even with the quantized models.

https://huggingface.co/blog/smollm