by neilv 1039 days ago
I was using various 4-bit quantized models earlier, but decided to go back to 8-bit for 13B, since I had the VRAM anyway, and (at the time, for other reasons) was seeing some quirky behavior.

70B is currently 4-bit on this box, and once I have GPU accel for 70B, I'll see how the quality compares to 13B 8-bit.
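
For a rough sense of the VRAM tradeoff, here is a back-of-the-envelope sketch. It assumes memory is dominated by the weights at exactly the stated bit width, and it ignores the KV cache, activations, and per-block quantization overhead, so real usage will be somewhat higher. The function name `weight_memory_gb` is just illustrative, not from any library.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal GB.

    Assumption: every parameter is stored at bits_per_weight bits;
    quantization scales, KV cache, and activations are not counted.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    configs = [("13B 4-bit", 13, 4), ("13B 8-bit", 13, 8), ("70B 4-bit", 70, 4)]
    for name, params, bits in configs:
        print(f"{name}: ~{weight_memory_gb(params, bits):.1f} GB of weights")
```

By this estimate, 13B at 8-bit needs roughly 13 GB for weights and 70B at 4-bit roughly 35 GB, which is why the larger model is the one more likely to be kept at 4-bit.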