HN Mirror



	by irusensei 876 days ago
	Why not both? llama.cpp can split a GGUF model's layers between GPU and CPU memory.
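
For anyone wondering what that split looks like in practice, here is a rough sketch. It assumes every layer is about the same size and ignores the KV cache and scratch buffers, so treat the result as a starting guess rather than a guarantee. The helper names, the model path, and the layer/VRAM numbers are all made up for illustration, and the binary name `llama-cli` is an assumption (older builds shipped it as `main`). The `-m` and `--n-gpu-layers` flags are the ones llama.cpp uses for the model path and the number of layers offloaded to the GPU.

```python
def layers_on_gpu(n_layers: int, bytes_per_layer: int, vram_budget_bytes: int) -> int:
    """Return how many whole layers fit in the VRAM budget; the rest stay in CPU RAM."""
    if bytes_per_layer <= 0:
        raise ValueError("bytes_per_layer must be positive")
    fit = max(vram_budget_bytes, 0) // bytes_per_layer
    return min(n_layers, fit)


def build_command(model_path: str, n_gpu_layers: int, binary: str = "llama-cli") -> list[str]:
    """Build an argument list for llama.cpp (binary name is an assumption)."""
    return [binary, "-m", model_path, "--n-gpu-layers", str(n_gpu_layers)]


GIB = 1024 ** 3

# Hypothetical numbers: 32 layers of roughly 0.5 GiB each, 8 GiB of VRAM to spend.
n = layers_on_gpu(n_layers=32, bytes_per_layer=GIB // 2, vram_budget_bytes=8 * GIB)
cmd = build_command("model.gguf", n)
print(n, " ".join(cmd))
```

If the model then fails to load or runs out of VRAM, lowering the layer count by a few is the usual next step, since the estimate above leaves no headroom for context buffers.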