by chollida1 937 days ago
Ah, gotcha, so it's the fact that it's 10,000 chips in one dedicated cluster that makes it large, as opposed to Azure, which has an order of magnitude more GPUs but rents many of those out.
2 comments

High performance on a single task requires simultaneous computation and communication between nodes. If there's high latency between nodes, such as between nodes in different data centers, the communication costs can't be masked by computation.
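A rough way to see this is a toy model where communication overlaps with computation, so only the part of the communication that outlasts the computation adds to step time. This is a minimal sketch, not a measurement: the function names, latencies, bandwidths, and payload size are all made-up illustrative assumptions.

```python
# Toy model: communication runs concurrently with computation, so only the
# portion that outlasts the computation is "exposed" and slows the step down.
# All numbers below are hypothetical, chosen only to illustrate the idea.

def comm_time(latency_s, payload_bytes, bandwidth_bytes_per_s):
    """Time to move one payload: fixed latency plus transfer time."""
    return latency_s + payload_bytes / bandwidth_bytes_per_s

def exposed_comm(compute_s, comm_s):
    """Communication time that cannot be hidden behind computation."""
    return max(0.0, comm_s - compute_s)

compute_s = 0.050   # assumed compute time per step (50 ms)
payload = 100e6     # assumed bytes exchanged per step (100 MB)

# Assumed link characteristics: same cluster vs. between data centers.
in_cluster = comm_time(5e-6, payload, 50e9)
cross_dc = comm_time(30e-3, payload, 1e9)

print("in-cluster exposed:", exposed_comm(compute_s, in_cluster))
print("cross-DC exposed:  ", exposed_comm(compute_s, cross_dc))
```

With these assumed numbers, the in-cluster transfer finishes well within the compute window, while the cross-data-center transfer takes longer than the compute it is supposed to hide behind, so every step waits on the network.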
I guess Azure's GPUs are spread out too, so latency is higher between its data centers around the world.