by pixelmelt 147 days ago
I would look into running a 4-bit quant with llama.cpp (or any of its wrappers).
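For a rough sense of why 4-bit helps, here is a back-of-the-envelope sketch of weight memory only. The 7B parameter count is a made-up example, not something from this thread, and `weight_bytes` is just an illustrative helper. Real 4-bit formats store per-block scales, so they land somewhat above 4 bits per weight, and the KV cache and runtime overhead come on top.

```python
def weight_bytes(n_params: int, bits_per_weight: float) -> float:
    """Approximate bytes needed to hold the weights alone (no KV cache, no overhead)."""
    return n_params * bits_per_weight / 8


# Hypothetical 7-billion-parameter model, chosen only for illustration.
N_PARAMS = 7_000_000_000

fp16_gb = weight_bytes(N_PARAMS, 16) / 1e9  # 16 bits per weight
q4_gb = weight_bytes(N_PARAMS, 4) / 1e9     # idealized 4 bits per weight

print(f"fp16: {fp16_gb:.1f} GB, 4-bit: {q4_gb:.1f} GB")
# prints "fp16: 14.0 GB, 4-bit: 3.5 GB"
```

So under these assumptions the weights shrink to roughly a quarter of their fp16 size, which is usually the difference between fitting on a consumer GPU or not.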