| HN Mirror

	by npodbielski 1 day ago
	I bought an R9700 for about $1,700-1,800 and I get around 800 t/s on prompt processing and about 50 t/s on inference, on average. It hurts a bit when you change the prompt, because llama.cpp has to discard the entire cache and then think for 2-5 min depending on the context, but otherwise it is faster than I can read.
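
For a rough sense of what those numbers imply, here is a back-of-the-envelope sketch. It assumes prompt throughput stays constant at the quoted 800 t/s regardless of context length, which is a simplification (in practice throughput usually drops as the context grows), and the function names are made up for illustration, not part of llama.cpp.

```python
def reprocess_seconds(context_tokens: float, prompt_tps: float) -> float:
    """Seconds needed to re-evaluate a full prompt after the cache is discarded.

    Assumes a constant prompt-processing rate (a simplification).
    """
    if prompt_tps <= 0:
        raise ValueError("prompt_tps must be positive")
    return context_tokens / prompt_tps


def tokens_for_wait(wait_seconds: float, prompt_tps: float) -> float:
    """Context size (in tokens) that would explain a given wait at a constant rate."""
    if prompt_tps <= 0:
        raise ValueError("prompt_tps must be positive")
    return wait_seconds * prompt_tps


PROMPT_TPS = 800.0  # prompt-processing figure quoted in the comment

for minutes in (2, 5):
    tokens = tokens_for_wait(minutes * 60, PROMPT_TPS)
    print(f"{minutes} min at {PROMPT_TPS:.0f} t/s ~ {tokens:,.0f} tokens")
```

Under that constant-rate assumption, a 2-5 min wait corresponds to roughly 96,000-240,000 tokens of context being re-evaluated; real contexts are likely smaller if throughput degrades at long context.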