Hacker News
by bufo 1105 days ago
Yes, you are correct that the ANE has the equivalent of tensor cores, and I didn’t mention that. I just don’t expect it to be usable beyond inference, because it has too few compute units to handle batches in medium, large, or huge networks. That’s obviously by design! The ANE takes up a tiny amount of silicon compared to the GPU. I wouldn’t actually be surprised if Apple strategically invests only in its GPU for LLM (1B+ params) work.

Note that if you are currently using CoreML for LLMs, all the work is done on the GPU.