HN Mirror



	by yobanate 914 days ago
	Can confirm. My M3 Max gets about 22 t/s, which puts the bottleneck between keyboard and chair (BKAC).


That's a 10x speed increase. What's the secret behind the Apple M3? Faster-clocked RAM? Specific AI hardware?

Unified memory and optimizations in llama.cpp (which Ollama wraps).

Is that using the GPU?

It's configurable. The repo has the details, but llama.cpp uses Metal, Apple's GPU API, so yes, it can run on the GPU.
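
For anyone who wants to check their own t/s figure, here is a minimal Python sketch. It assumes a local Ollama server on its default port and relies on the `eval_count` and `eval_duration` (nanoseconds) fields that Ollama's `/api/generate` response reports. The helper names (`build_request`, `tokens_per_second`, `generate`) are made up for illustration, and passing `num_gpu` under `options` to control how many layers are offloaded to the GPU is my understanding of the Ollama API, so treat it as an assumption and check the repo.

```python
import json
import urllib.request
from typing import Optional

# Default local Ollama endpoint (assumes an unmodified local install).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, gpu_layers: Optional[int] = None) -> dict:
    """Build a non-streaming generate request (hypothetical helper)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if gpu_layers is not None:
        # Assumption: "num_gpu" sets the number of layers offloaded to the GPU.
        payload["options"] = {"num_gpu": gpu_layers}
    return payload


def tokens_per_second(response: dict) -> float:
    """Generation speed from eval_count (tokens) and eval_duration (ns)."""
    return response["eval_count"] * 1e9 / response["eval_duration"]


def generate(payload: dict) -> dict:
    """POST the payload to the local server and return the parsed JSON reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Usage (needs a running server and a model you have already pulled):
#   reply = generate(build_request("<your-model>", "Why is the sky blue?"))
#   print(f"{tokens_per_second(reply):.1f} t/s")
```

Only generation speed is measured here; prompt processing is reported separately in the same response, so the number should be comparable to the t/s figures quoted above.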