by tarruda
930 days ago
> I can run it on my Macbook Air at 12tkps, can't wait to try this on my desktop.

That seems kinda low. Are you using Metal GPU acceleration with llama.cpp? I don't have a MacBook, but I saw some llama.cpp benchmarks that suggest it can reach close to 30tk/s with GPU acceleration.
|
If anyone gets faster than 12tkps on an Air, let me know.
I'm using the LM Studio GUI over llama.cpp with the "Apple Metal GPU" option. Increasing CPU threads seemingly does nothing without Metal either.
RAM usage hovers at 5.5GB with a q5_k_m of Mistral.
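
For a rough sanity check on that 5.5GB figure, here's a back-of-the-envelope sketch of how big the quantized weights alone should be. The parameter count (about 7.24 billion for Mistral 7B) and the average bits per weight (about 5.5 for a q5_k_m quant) are my assumptions, not numbers from this thread, and the helper name is made up for illustration.

```python
def estimate_weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumed values, not taken from the thread:
N_PARAMS = 7.24e9        # approximate parameter count of Mistral 7B
BITS_PER_WEIGHT = 5.5    # rough average for a q5_k_m quant

weights_gb = estimate_weights_gb(N_PARAMS, BITS_PER_WEIGHT)
print(f"estimated weights: {weights_gb:.2f} GB")
```

Under those assumptions the weights come out a bit under 5GB, so 5.5GB total looks plausible once you add the KV cache and runtime overhead.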