by choppaface 2337 days ago

A single TPU v3 has 8 cores with 16GB each, so that’s 128GB of memory total, which is more than any single GPU currently has.

The TPU software does data parallelism (in TensorFlow) transparently, and it’s somewhat easier to do model parallelism because the memory link between cores is solid and requires no special setup or drivers. You’ll still get an OOM from XLA if you have a tensor that won’t fit in the 16GB of a single core.
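
To make the per-core limit concrete, here's a rough back-of-the-envelope sketch in plain Python. It only counts the raw bytes of one dense tensor and treats "GB" as GiB; it ignores XLA padding, activations, optimizer state and everything else that also lives in that 16GB, so take it as a lower bound. The embedding table shape is a made-up example, not something from a real model.

```python
from math import prod

GIB = 1024 ** 3
CORES_PER_DEVICE = 8     # from the comment: a TPU v3 has 8 cores
HBM_PER_CORE_GIB = 16    # from the comment: 16GB per core (assumed to mean GiB)


def tensor_gib(shape, bytes_per_element=4):
    """Raw size of a dense tensor in GiB (float32 by default)."""
    return prod(shape) * bytes_per_element / GIB


def fits_on_one_core(shape, bytes_per_element=4):
    """True if the tensor alone is no larger than one core's memory."""
    return tensor_gib(shape, bytes_per_element) <= HBM_PER_CORE_GIB


# Total memory across the device: 8 cores * 16 = 128.
total_gib = CORES_PER_DEVICE * HBM_PER_CORE_GIB

# Hypothetical float32 embedding table: 100M rows x 64 columns.
table = (100_000_000, 64)

# Same table split row-wise across the 8 cores (model parallelism).
shard = (table[0] // CORES_PER_DEVICE, table[1])

if __name__ == "__main__":
    print(f"total device memory: {total_gib} GiB")
    print(f"full table: {tensor_gib(table):.2f} GiB, fits on one core: {fits_on_one_core(table)}")
    print(f"one shard:  {tensor_gib(shard):.2f} GiB, fits on one core: {fits_on_one_core(shard)}")
```

The full table is bigger than one core's memory even though it's far below the 128GB total, which is the case where you'd hit the OOM unless you shard it yourself.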

TPU pods are easier to use than clusters of InfiniBand-linked Volta boxes. For TPUs you just give GCE money and make some small changes to your use of the TPU API. For the Volta cluster you’d probably need to bring your own orchestration (e.g. Horovod). So a TPU pod is currently easier for one person to use and administer.