Hacker News
by 2mlWQbCK 508 days ago
What benefit does Ollama (or RamaLama) offer over just plain llama.cpp or llamafile? The only thing I understand is that models are downloaded automatically behind the scenes, but a big reason for me to use local models at all is that I want to know exactly which files I use and keep them sorted and backed up properly, so a tool that automatically downloads models and dumps them in some cache directory just sounds annoying.
2 comments

IIRC it makes things a little easier, e.g. you don't need to specify a CLI flag to set how many layers to offload to the GPU, and it provides an API that other programs on your system can use (e.g. openwebui).

It's been a while since I used llama.cpp directly, and I don't know whether I'm correct about its current scope.
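To make the API point concrete, here is a minimal sketch of how another program might call a locally running Ollama server using only the Python standard library. It assumes Ollama is listening on its default address (localhost:11434) and uses the non-streaming form of the /api/generate endpoint. The helper names and the model name in the usage comment are placeholders for illustration, not anything from the thread.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt, url=OLLAMA_URL):
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, url=OLLAMA_URL, timeout=120):
    """Send the request and return the generated text from the JSON reply."""
    req = build_generate_request(model, prompt, url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Usage (needs a running server and an already pulled model;
# "llama3.2" is only a placeholder name):
#   print(generate("llama3.2", "Why is the sky blue?"))
```

Splitting request construction from sending keeps the payload easy to inspect without a server running.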

RamaLama stands on the shoulders of giants by building upon llama.cpp (and other projects like minja, podman, vllm, etc.). We've been contributing back too: Sergio Lopez, Michael Engel and I all contribute to llama.cpp (just three examples of RamaLama people off the top of my head).

We write the higher-level abstractions in python3 (with no dependencies on Python libs outside of the standard library) because only the heavy lifting needs to be done in C++. Python is also a nice, community-friendly language; many people know how to write it.