by av_conk
500 days ago
I tried Ollama because I couldn't get ROCm working with llama-cpp on my system; Ollama bundles the ROCm libraries for you. I got around 50 tokens per second with that setup. I then tried llama-cpp with the Vulkan backend, and it doubled the number of tokens per second. I was under the impression that ROCm is superior to Vulkan, so the result confused me. In any case, I've stuck with llama-cpp.