|
|
|
|
|
by brandall10
407 days ago
|
|
That rule of thumb is only related to 8 bit quants at low context. The default for ollama is 4 bit, which puts it roughly about 14GB. The vast majority of people run between 4-6 bit depending on system capability. The extra accuracy above 6 tends to not be worth it relative to the performance hit. |
|