Hacker News
by feznyng 544 days ago
Besides the official docs, you can check out llama.cpp as an example that uses Metal for accelerated inference on Apple silicon.