by bigyikes 653 days ago
Is there evidence that Apple is training a model large enough to require a huge amount of compute?

https://arxiv.org/abs/2407.21075

AFM-server was trained on 8192 TPUv4 chips.

Someone more versed can say if that is huge or not.

It is a far larger scale than most high-performance clusters offer.
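
For a rough sense of scale, here is a back-of-the-envelope sketch of the aggregate peak throughput of 8192 chips. It assumes Google's published peak of about 275 TFLOP/s (bf16) per TPU v4 chip; the 40% utilization figure is purely a hypothetical placeholder, since real sustained throughput depends on the model and the software stack.

```python
# Back-of-the-envelope estimate, not a measurement.
# Assumption: ~275 TFLOP/s peak bf16 per TPU v4 chip (published peak figure).
PEAK_FLOPS_PER_CHIP = 275e12
NUM_CHIPS = 8192  # chip count quoted in the thread


def aggregate_peak_flops(num_chips, peak_per_chip=PEAK_FLOPS_PER_CHIP):
    """Theoretical peak FLOP/s if every chip ran at its peak rate."""
    return num_chips * peak_per_chip


def sustained_flops(num_chips, utilization, peak_per_chip=PEAK_FLOPS_PER_CHIP):
    """Peak scaled by an assumed utilization fraction between 0 and 1."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return aggregate_peak_flops(num_chips, peak_per_chip) * utilization


peak = aggregate_peak_flops(NUM_CHIPS)
# Hypothetical 40% utilization, chosen only for illustration.
sustained = sustained_flops(NUM_CHIPS, 0.40)

print(f"peak:      {peak / 1e18:.2f} exaFLOP/s")
print(f"sustained: {sustained / 1e18:.2f} exaFLOP/s (assumed 40% utilization)")
```

Under those assumptions the peak comes out to roughly 2.25 exaFLOP/s of low-precision compute, which is not directly comparable to the FP64 numbers usually quoted for HPC clusters.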