by Gorgor 909 days ago
But no one is training these kinds of models on their personal device. You need compute clusters for that, and those will probably run Linux. I'd be surprised if Microsoft trained its large models on anything other than Linux clusters.
2 comments

> But no one is training these kinds of models on their personal device

On-device transfer learning/fine-tuning is def a thing, for privacy and data federation reasons. That's part of the reason model distillation was so hot a few years ago.

Apple used to sell servers. I don’t think they should settle for “just use Linux” in such an important field.
Why does the OS matter for training models?

Apple would want to train models as fast as they could. Nvidia provides an off-the-shelf solution they can just buy and use for a very reasonable price, then resell on the second-hand market.

If they wanted to use their own hardware they would either need more of it, which would cost a lot and divert production from sellable devices; or they would need to make special chips with much bigger neural engines, which would cost even more.

Also, Apple uses public clouds for service stuff. They may not even own any hardware and may just be renting it from AWS/Azure/GCP for training.

I feel like I’ve answered the issues you raised in this thread already.
> Used to

Exactly, over a decade ago...