| HN Mirror


	by derbaum 412 days ago
	Very rough (!) napkin math: for a q8 model (almost lossless) the VRAM requirement in GB is about the parameter count in billions. For q4, with some quality loss, it's roughly half that. Then you add a little bit for the context window and overhead. So a 32B model at q4 needs about 16 GB for the weights and should run comfortably on 20-24 GB. Again, very rough numbers; there are calculators online.
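	If it helps, here is the same napkin math as a tiny Python sketch. The function name `estimate_vram_gb` and the 4 GB default for context window and overhead are my own assumptions for illustration, not measured values; the only rule taken from the comment is "bits per weight / 8 bytes per parameter".

```python
def estimate_vram_gb(params_billion: float,
                     bits_per_weight: float,
                     overhead_gb: float = 4.0) -> float:
    """Very rough VRAM estimate in GB for a quantized model.

    params_billion  -- parameter count in billions (e.g. 32 for a 32B model)
    bits_per_weight -- 8 for q8, 4 for q4
    overhead_gb     -- assumed allowance for context window and runtime
                       overhead (hypothetical default, tune for your setup)
    """
    # One billion parameters at 8 bits each is roughly 1 GB of weights.
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


if __name__ == "__main__":
    for bits in (8, 4):
        print(f"32B at q{bits}: ~{estimate_vram_gb(32, bits):.0f} GB")
```

	With the assumed 4 GB allowance, a 32B model at q4 lands at the low end of the 20-24 GB range; a longer context window would push it higher.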