HN Mirror

	by mfa1999 122 days ago
	How does this compare to llama.cpp in terms of performance?

1 comments

MLX is a bit faster (by a low double-digit percentage), but uses a bit more RAM. A worthwhile tradeoff for many.

On my M4 Pro, MLX gets almost 2x the tok/s.