Ollama provides a web server with API that just works out of the box, which is great when you want to integrate multiple applications (potentially distributed on smaller edge devices) with LLMs that run on a single beefy machine.
In my home I have a large gaming rig that sometimes runs Ollama+Open WebUI, then I also have a bunch of other services running on a smaller server and a Raspberry Pi which reach out to Ollama for their LLM inference needs.
Are you talking about the Hugging Face Python libraries, the Hugging Face hosted inference APIs, the Hugging Face web interfaces, the Hugging Face iPhone app, Hugging Face Spaces (hosted Docker environments with GPU access) or something else?
In my home I have a large gaming rig that sometimes runs Ollama+Open WebUI, then I also have a bunch of other services running on a smaller server and a Raspberry Pi which reach out to Ollama for their LLM inference needs.