by LoganDark 1108 days ago
> The model I have is q4_0 I think that's 4 bit quantized

That's correct, yeah. Q4_0 should be the smallest and fastest quantized model.

> I'm running in Windows using koboldcpp, maybe it's faster in Linux?

Possibly. You could try using WSL to test. I think both WSL1 and WSL2 are faster than native Windows (though WSL1 should be faster than WSL2).

1 comment

I didn't know what WSL was, but now I do, thanks for the tip!