Hacker News
by pheug 2420 days ago
Actually, it parallelizes extremely well, which is why large companies can create monster models like the ones mentioned in the article in the first place: they just throw money at the problem with TPUs and similar highly parallel accelerators. It just doesn't lend itself well to distributed computing, e.g. because of throughput requirements.
1 comment

That's just vertical scale. Distributed is what I was referring to. See comment below.