| HN Mirror

	by asimovDev 105 days ago
	I am running the 80B Qwen Coder Next model (4-bit quant, MLX version) on a 96GB M3 MacBook, and it responds almost immediately. The model plus 128k of context fits comfortably in memory.
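
	For anyone wondering why that fits, here is a rough back-of-envelope sketch in Python. The 80B parameter count, 4-bit width, 128k context, and 96GB come from the comment above; everything else is an assumption. In particular, the layer count, KV head count, and head dimension are placeholder values I picked for illustration and are not the model's real configuration. The estimate also ignores quantization scale overhead and anything else the runtime allocates, so treat it as an order-of-magnitude check only.

```python
def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, ignoring quantization scales/biases."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   n_tokens: int, bytes_per_elem: int = 2) -> int:
    """Memory for a plain attention KV cache (factor 2 = keys + values).

    bytes_per_elem=2 assumes a 16-bit cache, which is an assumption here.
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_elem


GB = 1e9  # decimal gigabytes

# From the comment: 80B parameters at 4 bits each.
weights = weight_bytes(80e9, 4)

# HYPOTHETICAL architecture numbers, chosen only to make the sketch runnable.
kv = kv_cache_bytes(n_layers=48, n_kv_heads=8, head_dim=128, n_tokens=128_000)

total = weights + kv
print(f"weights:  {weights / GB:.1f} GB")
print(f"kv cache: {kv / GB:.1f} GB (placeholder architecture)")
print(f"total:    {total / GB:.1f} GB of 96 GB")
```

	The weights term is the solid part: 80B parameters at 4 bits is about 40 GB, which leaves more than half of the 96GB for context and the OS. The KV cache term depends entirely on the real architecture, so swap in the actual values from the model's config before trusting that number.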