by npodbielski
1 day ago
I bought an R9700 for about $1,700-1,800 and I get something like 800 t/s on prompt processing and about 50 t/s of inference on average. It hurts a bit when you change the prompt, because llama.cpp has to discard the entire cache and then think for 2-5 min depending on the context, but otherwise it is faster than I can read.
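
For a rough sense of scale, here is a back-of-the-envelope sketch of what that wait implies. It assumes the roughly 800 t/s prompt speed stays constant over the whole context, which is a simplification (prompt throughput usually drops as the context grows), and the function names are made up for illustration, not anything from llama.cpp.

```python
# Rough estimate only: assumes prompt processing runs at a constant rate,
# which is an assumption and not a measured property of any setup.

def reprocess_seconds(context_tokens: int, prompt_tps: float = 800.0) -> float:
    """Seconds needed to re-evaluate the whole context after a cache discard."""
    return context_tokens / prompt_tps


def tokens_for_wait(wait_seconds: float, prompt_tps: float = 800.0) -> float:
    """Context size (in tokens) that would explain a given wait."""
    return wait_seconds * prompt_tps


if __name__ == "__main__":
    for minutes in (2, 5):
        tokens = tokens_for_wait(minutes * 60)
        print(f"{minutes} min at 800 t/s is about {tokens:,.0f} tokens")
```

Under that assumption a 2-5 min wait lines up with a context somewhere around 96k to 240k tokens, so the real contexts are probably smaller if prompt speed falls off at long context.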