by FieryTransition
777 days ago
Because quantizers take time to become bug-free when new architectures are released. If a model was quantized with a quantizer that had a known bug, the resulting quantized files are effectively buggy as well, and they need to be requantized with a newer version of llama.cpp that includes the fix.