by bugglebeetle 914 days ago
Unified memory and optimizations in llama.cpp (which Ollama wraps).

Is that using the GPU?
It's configurable, and the details are in the repo, but yes: llama.cpp uses Metal, Apple's GPU API.
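
For what it's worth, here is a rough sketch of what "configurable" can look like if you call llama.cpp directly rather than through Ollama. It only builds the command line, using llama.cpp's `--n-gpu-layers` flag to choose how many layers are offloaded to the GPU. The binary name `llama-cli` is an assumption (older builds call it `main`), and the helper name and model path are made up for illustration.

```python
def llama_cli_args(model_path: str, prompt: str, gpu_layers: int) -> list[str]:
    """Build an argument list for llama.cpp's command-line tool.

    gpu_layers=0 keeps every layer on the CPU; a larger value asks
    llama.cpp to offload that many layers to the GPU.
    """
    if gpu_layers < 0:
        raise ValueError("gpu_layers must be >= 0")
    return [
        "llama-cli",            # assumed binary name; older builds use "main"
        "-m", model_path,       # path to a GGUF model file (hypothetical here)
        "-p", prompt,           # prompt text
        "--n-gpu-layers", str(gpu_layers),
    ]


# CPU only versus offloading 32 layers to the GPU.
cpu_args = llama_cli_args("model.gguf", "Hello", 0)
gpu_args = llama_cli_args("model.gguf", "Hello", 32)
print(" ".join(gpu_args))
```

Pass the resulting list to `subprocess.run` once you have a real binary and model file on disk.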