Hacker News
by 3abiton 914 days ago
That's a 10x speed increase. What's the secret behind the Apple M3? Faster-clocked RAM? Dedicated AI hardware?

Unified memory and optimizations in llama.cpp (which Ollama wraps).
Is that using the GPU?
It's configurable. The details are in the repo, but llama.cpp uses Metal for GPU acceleration on Apple silicon.
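For anyone who wants to try the configuration side of this, here is a minimal sketch. It assumes a local Ollama server on its default port (11434) and that the `num_gpu` option (the number of layers to offload to the GPU) behaves as described in the Ollama docs. The model name `llama2` and the helper names `build_request` and `generate` are placeholders of mine, not anything from the repo.

```python
import json
import urllib.request

# Assumed default endpoint of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama2", gpu_layers=None):
    """Build the JSON payload for a non-streaming generate call.

    gpu_layers=None leaves Ollama's own default in place;
    gpu_layers=0 asks for CPU-only inference (assumption: `num_gpu`
    is the option that controls how many layers go to the GPU).
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    if gpu_layers is not None:
        payload["options"] = {"num_gpu": gpu_layers}
    return payload


def generate(prompt, **kwargs):
    """Send the request to the local server and return the generated text."""
    data = json.dumps(build_request(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Why is the sky blue?", gpu_layers=0)` and then the same prompt without `gpu_layers` should give a rough feel for how much the GPU path contributes on a given machine, provided the server is running and the model has been pulled.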