by parched99
421 days ago
I can only get Gemma-3-27b-it-qat-Q4_0.gguf (15.6 GB) to run with a 100-token context on a 5070 Ti (16 GB) using llama.cpp.

Prompt: 10 tokens, 229.089 ms, 43.7 t/s
Generation: 41 tokens, 959.412 ms, 42.7 t/s
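For anyone who wants to sanity-check those numbers, here is a rough sketch of the arithmetic. The token counts, timings, and sizes come from the figures above. The helper name `tokens_per_second` is made up for illustration, and the headroom figure is an assumption: it treats the file size and the card's nominal capacity as the same unit and ignores runtime overhead, so the space actually left for the KV cache is likely smaller.

```python
def tokens_per_second(tokens: int, elapsed_ms: float) -> float:
    """Convert a token count and an elapsed time in milliseconds to tokens/s."""
    return tokens / (elapsed_ms / 1000.0)


# Figures reported in the comment above.
prompt_tps = tokens_per_second(10, 229.089)
gen_tps = tokens_per_second(41, 959.412)

# Assumption: nominal card capacity minus model file size, same units,
# ignoring driver, compute-buffer, and other runtime overhead.
headroom_gb = 16.0 - 15.6

print(f"prompt:     {prompt_tps:.1f} t/s")
print(f"generation: {gen_tps:.1f} t/s")
print(f"nominal headroom: {headroom_gb:.1f} GB")
```

The computed speeds match the reported 43.7 and 42.7 t/s. The nominal headroom of roughly 0.4 GB is consistent with only a very small context fitting alongside the weights.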