| HN Mirror



	by PhilippGille 639 days ago
	Yes, I'm aware. I was contrasting the general use of an inference server with calling llama.cpp directly (not via HTTP request). And among servers, Ollama seems to be more popular, so it's worth mentioning when talking about support for local LLMs.