HN Mirror



	by polishgladiator 1046 days ago
	I've been doing some hacking with Llama2 on an AMD 7900 XTX this weekend, using llama.cpp and q5_k_s quantization. Compared to MK600 on an RTX 4090 in their data, I am measuring higher throughput and lower perplexity (again, note that I am using a cheaper GPU!)...