| HN Mirror



	by p-e-w 898 days ago
	AFAICT, Nitro is just a wrapper around llama.cpp. Therefore, you can simply look at llama.cpp benchmarks, of which there are plenty.


UnlockedSecrets 898 days ago

In my testing, Oobabooga and other front ends and similar projects have shown upwards of a 50% difference in inference speed on the same model and settings, so benchmarks are still useful.
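For anyone who wants to check a gap like this themselves, here is a minimal sketch of the arithmetic, not a real benchmark harness. The names `tokens_per_second` and `relative_slowdown` are hypothetical, and `generate` is assumed to be any callable you supply that runs one generation and returns the number of tokens it produced; nothing here is part of llama.cpp, Nitro, or any front end's API.

```python
import time
from typing import Callable


def tokens_per_second(generate: Callable[[], int], runs: int = 3) -> float:
    """Average throughput over `runs` calls.

    `generate` is a hypothetical callable: it performs one generation
    and returns how many tokens were produced.
    """
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(runs):
        total_tokens += generate()
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed


def relative_slowdown(baseline_tps: float, frontend_tps: float) -> float:
    """Fraction by which the front end is slower than the baseline.

    0.5 means the front end delivers half the baseline throughput.
    """
    return (baseline_tps - frontend_tps) / baseline_tps
```

Run both backends with the same model, prompt, and sampling settings, then compare the two throughput numbers with `relative_slowdown`.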


brucethemoose2 898 days ago

Ooba is an outlier, and has tons of overhead over llama.cpp and llama-cpp-python for some reason.

Most llama.cpp OpenAI servers are pretty close to vanilla llama.cpp, albeit without the batching support.
