| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by milansuk 639 days ago
	No need to use Ollama. LLama.cpp has its own OpenAI-compatible server[0] and it works great. [0] https://github.com/ggerganov/llama.cpp#web-server

2 comments

citizenpaul 638 days ago

Thanks didn't know that.

Do you happen to know the reason to use ollama rather than the built in server? How much work is required to get similar functionality? looks like just downloading the models? I find it odd that ollama took off so quickly if LLamma.cpp had the same built in functionality.

link

PhilippGille 639 days ago

Yes I'm aware. I was contrasting the general use of an inference server vs calling llama.cpp directly (not via HTTP request).

And among servers Ollama seems to be more popular, so it's worth mentioning when talking about support for local LLMs.

link