| HN Mirror

	by rahimnathwani 508 days ago
	IIRC it makes things a little easier, e.g. you don't need to specify a CLI flag to set how many layers to offload to the GPU, and it provides an API that other programs on your system can use (e.g. openwebui). It's been a while since I used llama.cpp directly, so I don't know whether I'm correct about its current scope.