


	by Remmy 1108 days ago
	I've been using llama.cpp with the Python wrappers and the speed increase has been great, but it seemed to be limited to a max of 40 N_GPU_LAYERS. I'm going to have to update and see what sort of improvement I get.