by aurareturn 283 days ago
The Neural Engine is optimized for power efficiency, not performance.

Look for Apple to add matmul acceleration to the GPU instead. That's how to truly speed up local LLMs.