by orost
1115 days ago
Quantization isn't (and wasn't) expensive; it's mostly just data shuffling. A good PC will do a 7B model in half a minute, and a larger model in up to a few minutes. Quantized models are made available for download mostly for the benefit of less technical users who may not be comfortable with the command-line tools, or for people with slow or metered connections who'd much rather download 15GB of data than download 60GB only to squish it into 15.
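To give a rough sense of why this is cheap, here is a minimal sketch of block-wise round-to-nearest quantization in plain Python. It is an illustration only, not the format or code any particular tool uses: the function names, the 4-bit signed range, and the sample block are all made up for the example. The size estimate assumes a hypothetical 30-billion-parameter model going from 16 to 4 bits per weight, which happens to match the 60GB to 15GB figure, and it ignores the small overhead of storing per-block scales.

```python
def quantize_block(values, bits=4):
    """Quantize one block of floats to signed integer codes plus one scale."""
    qmax = 2 ** (bits - 1) - 1                  # 7 for 4-bit signed codes
    absmax = max(abs(v) for v in values)
    scale = absmax / qmax if absmax > 0 else 1.0
    # One divide, one round, one clamp per weight: no training, no search.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate floats from the codes and the block scale."""
    return [scale * c for c in codes]


def approx_size_gb(n_params, bits_per_weight):
    """Rough file size in GB, ignoring per-block scale overhead."""
    return n_params * bits_per_weight / 8 / 1e9


# Hypothetical block of weights, for illustration only.
block = [0.12, -0.50, 0.33, 0.07, -0.21, 0.44, -0.02, 0.29]
scale, codes = quantize_block(block)
restored = dequantize_block(scale, codes)

print(codes)
print(approx_size_gb(30e9, 16), "GB ->", approx_size_gb(30e9, 4), "GB")
```

Each weight is touched once with a handful of arithmetic operations, so the work is dominated by reading the original file and writing the smaller one.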