HN Mirror

	by Der_Einzige 605 days ago
	Not even close. Llama.cpp is nowhere near a production-ready LLM inference engine, and it runs overwhelmingly faster when using CUDA.