| HN Mirror


	by novaomnidev 973 days ago
	Why is this faster than running llama.cpp main directly? I’m getting 7 tokens/sec with this, but only 2 with llama.cpp by itself.