| HN Mirror



	by hadlock 410 days ago
	deepseek-r1:8b screams on my 12GB GPU. gemma3:12b-it-qat runs just fine, a little faster than I can read. Once you exceed GPU memory (VRAM), it offloads a lot of the model to the CPU, and splitting between GPU and CPU is dramatically (80%? 95%?) slower.
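
	For a rough sense of why those two fit on a 12GB card, here is a back-of-the-envelope sketch. It is not how any particular runtime decides what to offload. The function names, the ~4 bits per weight figure, and the 1.5 GiB allowance for KV cache and buffers are all my assumptions; real usage depends on the quantization format and context length.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """Crude check: weights plus an assumed fixed overhead vs. available VRAM.

    overhead_gib is a placeholder guess for KV cache and runtime buffers.
    """
    return weights_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    # Assumed configurations: (label, billions of parameters, bits per weight)
    for label, params, bits in [("8B @ ~4-bit", 8, 4),
                                ("12B @ ~4-bit", 12, 4),
                                ("12B @ 16-bit", 12, 16)]:
        size = weights_gib(params, bits)
        print(f"{label}: ~{size:.1f} GiB weights, fits in 12 GiB: "
              f"{fits_in_vram(params, bits, 12)}")
```

	Under these assumptions the 8B and 12B models at about 4 bits per weight stay within 12 GiB, while the same 12B model at 16 bits would not, which is the point where offloading to the CPU would kick in.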