by derbaum
412 days ago
Very rough (!) napkin math: for a q8 model (almost lossless), the parameter count in billions is roughly the VRAM requirement in GB, since each weight takes about one byte. For q4, with some performance loss, it's roughly half that. Then you add a little bit for the context window and overhead. So a 32B model at q4 should run comfortably on 20-24 GB. Again, very rough numbers; there are calculators online.
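To make the arithmetic concrete, here's the same napkin math as a tiny Python sketch. The function name `estimate_vram_gb` and the flat 4 GB default for context plus overhead are my own assumptions for illustration, not figures from any particular runtime; the real overhead depends on context length and the inference engine.

```python
def estimate_vram_gb(params_billion: float,
                     bits_per_weight: float,
                     overhead_gb: float = 4.0) -> float:
    """Napkin estimate of VRAM in GB (hypothetical helper, very rough).

    params_billion  -- model size in billions of parameters (e.g. 32 for a 32B model)
    bits_per_weight -- 8 for q8, 4 for q4
    overhead_gb     -- assumed flat allowance for context window and runtime overhead
    """
    # 1e9 parameters at (bits / 8) bytes each is that many GB of weights.
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


print(estimate_vram_gb(32, 8, overhead_gb=0))  # q8 weights only: 32.0
print(estimate_vram_gb(32, 4))                 # q4 plus assumed 4 GB overhead: 20.0
```

Bumping `overhead_gb` to 8 gives 24 GB, which is where the 20-24 GB range for a 32B q4 model comes from under these assumptions.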