by p1esk 785 days ago
I didn’t upvote it because I don’t use Ollama. To experiment with LLMs I use Hugging Face. Does Ollama provide something I cannot get with Hugging Face?
3 comments

Ollama provides a web server with an API that just works out of the box, which is great when you want to integrate multiple applications (potentially distributed across smaller edge devices) with LLMs that run on a single beefy machine.

In my home I have a large gaming rig that sometimes runs Ollama+Open WebUI, then I also have a bunch of other services running on a smaller server and a Raspberry Pi which reach out to Ollama for their LLM inference needs.
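For anyone wondering what "reach out to Ollama" looks like from a client, here is a minimal sketch using only the Python standard library. It assumes Ollama is listening on its default port 11434 and that a model has already been pulled on the server; the model name `llama3` and the host name in the usage note are placeholders, not details from my setup.

```python
import json
import urllib.request

# Assumption: Ollama's default address; swap in the host name of the big machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3", url=OLLAMA_URL):
    """Build a POST request for Ollama's /api/generate endpoint."""
    # "stream": False asks for a single JSON object instead of a stream of chunks.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3", url=OLLAMA_URL, timeout=120):
    """Send the prompt to the Ollama server and return the generated text."""
    request = build_request(prompt, model=model, url=url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read())
    # With streaming disabled, the generated text is in the "response" field.
    return body["response"]
```

From the Raspberry Pi side, the call would be something like `generate("Summarize this log line: ...", url="http://gaming-rig.local:11434/api/generate")`, where `gaming-rig.local` is a made-up host name.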

Sure, maybe it’s better for niche use cases like yours.

HF is the biggest provider of LLMs, and I guess I haven’t run into its limitations yet.

Running locally is sometimes necessary, e.g. you don't want to send sensitive data to any random third party server.
Both Ollama and Hugging Face distribute models. The latter has model hosting services too, but that isn't the only way to use models from there.
Hugging Face is a model repository.

Ollama allows you to run those models.

Different things.

I run models using HF just fine. I mean I’m using the HF transformers library, which gets models from the HF Hub.

Or do you mean commercial deployment of models for inference?
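For comparison, a rough sketch of the transformers route: the library downloads the weights from the Hub on first use and runs them in-process, with no server involved. The model ID `gpt2` and the function name are only illustrative choices, and this assumes `transformers` plus a backend such as PyTorch are installed.

```python
def generate_locally(prompt, model_id="gpt2", max_new_tokens=20):
    """Run a Hub model in-process with the transformers pipeline API."""
    # Imported lazily so that merely defining the function does not require the library.
    from transformers import pipeline

    # The first call downloads the model from the Hugging Face Hub and caches it locally.
    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    # The text-generation pipeline returns a list of dicts with a "generated_text" key.
    return outputs[0]["generated_text"]
```

Something like `generate_locally("Ollama and Hugging Face are")` returns the prompt followed by the model's continuation.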

Are you talking about the Hugging Face Python libraries, the Hugging Face hosted inference APIs, the Hugging Face web interfaces, the Hugging Face iPhone app, Hugging Face Spaces (hosted Docker environments with GPU access) or something else?
I updated my comment above: I’m using the HF transformers library, which gets models from the HF Hub.
Do you have an NVIDIA GPU? I have not had much luck with the transformers library on a Mac.
Of course. I thought NVIDIA GPUs were pretty much a must-have for playing with DL models.