by jerrygenser 684 days ago
Ollama currently has only one "supported backend", which is llama.cpp. It enables downloading and running models on CPU, and it might have a more mature server.

This allows running models on GPU as well.

2 comments

I have been running Ollama on AMD GPUs (support for which came after NVIDIA GPUs) since February. Llama.cpp has supported them even longer.
How well does it run on AMD GPUs these days compared to Nvidia or Apple silicon?

I've been considering buying one of those powerful Ryzen mini PCs to use as an LLM server on my LAN, but I've read before that the AMD backend (ROCm IIRC) is kinda buggy.

I have an RX 7900 XTX and never had AMD-specific issues, except that I needed to set an environment variable.

But it seems like integrated GPUs are not supported:

https://github.com/ollama/ollama/issues/2637

Not sure about Ollama, but llama.cpp supports Vulkan for GPU computing.
Ollama runs on GPUs just fine - on Macs, at least.
Works fine on Windows with an AMD 7600XT.
I use it on Ubuntu and it works fine too.
It runs on GPUs everywhere. On Linux, on Windows...
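
For anyone who wants to check whether Ollama is actually offloading to the GPU on their own box rather than take a thread's word for it, here is a minimal sketch. It assumes a local Ollama server on the default address and that the `GET /api/ps` endpoint reports `size` and `size_vram` for each loaded model, as in the REST API docs for recent versions; older builds may not have it. The helper names `gpu_fraction` and `running_models` are made up for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def gpu_fraction(model: dict) -> float:
    """Share of a loaded model's bytes that sit in VRAM (0.0 means CPU only)."""
    size = model.get("size", 0)
    if size <= 0:
        return 0.0
    return model.get("size_vram", 0) / size


def running_models(base_url: str = OLLAMA_URL) -> list:
    """Fetch the currently loaded models from GET /api/ps (assumed available)."""
    with urllib.request.urlopen(f"{base_url}/api/ps", timeout=5) as resp:
        return json.load(resp).get("models", [])
```

With a model loaded, looping over `running_models()` and printing `gpu_fraction(m)` for each entry should give a rough idea of whether the weights ended up in VRAM, partly in VRAM, or entirely in system RAM.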