|
|
|
|
|
by MacsHeadroom
1201 days ago
|
|
Read The Case for 4 Bit Precision. https://arxiv.org/abs/2212.09720 Spoiler: it's the parameter count. As parameter count goes up, but depth matters less. It just so happens that at around 10B+ parameters you can quantize down to 4bit with essentially no downsides. Models are that big now. So there's no need to waste RAM by having unnecessary precision for each parameter. |
|