Hacker News
by int_19h 1131 days ago
I'm not sure about this model specifically, but training with 4-bit quantization has been a thing with LLaMA for a while now, although the setup involves manual hacks of various libraries.