


	by mark_l_watson 941 days ago
	Another data point: I can (barely) run a 30B 4-bit quantized model on a Mac Mini with 32 GB of on-chip memory, but it runs slowly (a little less than 10 tokens/second). 13B and 7B models run easily and much faster.
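
A rough back-of-the-envelope sketch of why 30B is a tight fit while 13B and 7B are comfortable. It counts only the weights (parameters × bits per weight / 8), so treat the results as lower bounds: real quantized formats store extra scale data, and the runtime also needs room for the KV cache, activations, and the OS. The function name `weight_memory_gb` is made up for illustration, and GB here means 10^9 bytes.

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Lower bound on memory for the weights alone, in GB (10^9 bytes)."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Model sizes mentioned in the comment, all at 4 bits per weight.
for size in (7, 13, 30):
    print(f"{size}B at 4-bit: at least ~{weight_memory_gb(size, 4):.1f} GB for weights")
```

By this estimate the 30B weights alone take roughly half of a 32 GB machine before any runtime overhead, which is at least consistent with "barely" fitting; the smaller models leave far more headroom.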