Hacker News
by vl 1611 days ago
Honestly, this single GPU-based install is child's play compared to Google's multiple TPU exaflop supercomputers with hyper-cube optical interconnects. Google's ML setups allow synchronous weight update on thousand+ TPUs...
4 comments

TPUs are amazing, but in my experience, debugging issues with them can be a bit tricky. Since Nvidia's GPUs are more commonplace (especially outside GCP), you can find a lot more information when you get stuck, and they're also more battle-tested.
For what it's worth, jax is helpful to me here. You can drop out of the jit to debug it as if it were numpy.

Of course that assumes your issues aren't with the jit itself or inside pmap, etc. That shit's hard.
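
To make the "drop out of the jit" point concrete, here's a rough sketch, assuming a recent JAX install. The `normalize` function and its input are made up for illustration; `jax.disable_jit()` is the context manager that makes a jitted function run op by op on concrete arrays, so `print` or a debugger sees real values instead of tracers.

```python
import jax
import jax.numpy as jnp


@jax.jit
def normalize(x):
    # Hypothetical example function. Under jit, `x` is an abstract tracer
    # while this body is being traced, so printing it shows no real values.
    centered = x - jnp.mean(x)
    return centered / jnp.std(x)


x = jnp.array([1.0, 2.0, 3.0, 4.0])

# Normal path: traced once, compiled by XLA, then executed.
compiled = normalize(x)

# Debug path: inside this context the jit decorator is bypassed, so the
# body runs eagerly and intermediate values can be inspected like numpy.
with jax.disable_jit():
    eager = normalize(x)

print(compiled)
print(eager)
```

Both paths should agree numerically. As the comment above says, this only helps when the bug isn't in the jit or the parallel machinery itself.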

Tbh I thought I was being trolled with 'hyper-cube optical interconnects'.
Actually, you are right, I mistyped. Although hypercube interconnects exist and were used, for example, in the AS400, the system in question uses a hypertorus topology.
For what it's worth, for attention-based advertising (YouTube and display, not search), FB targeting blows Google out of the water. Not sure why, but it's consistent across brands.
I have seen this myself, but I'm unsure if it's just an "ad quality" thing. For example, I can target exact placements on YouTube for my exact niche, and broad Facebook matching will outperform. I have tried YouTube and display for months with nothing within an order of magnitude as effective as Facebook.
For TPUv3 it's a 2D torus, not a hypercube, right? Not sure if the TPUv4 topology is externally published, but IIRC hypercubes are basically never used any more.
I mistyped: one version is a 2D torus, the next is a 3D torus, aka hypertorus.
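
For anyone unsure how the topologies in this subthread differ, here's a small sketch in plain Python. The grid sizes are arbitrary illustration values, not actual TPU pod dimensions, and the function names are made up. In a torus each node links to its +1/-1 neighbor along every axis with wraparound; in a hypercube each node links to every node whose ID differs in exactly one bit.

```python
def torus_neighbors(coord, dims):
    """Neighbors of `coord` in a torus with side lengths `dims`.

    Each axis contributes a +1 and a -1 link, wrapping around at the edges.
    """
    neighbors = set()
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % size
            neighbors.add(tuple(n))
    # Guard against degenerate axes of length 1, which would self-link.
    neighbors.discard(tuple(coord))
    return sorted(neighbors)


def hypercube_neighbors(node, n_bits):
    """Neighbors of `node` in an n-dimensional hypercube: flip one bit each."""
    return sorted(node ^ (1 << bit) for bit in range(n_bits))


# 2D torus, hypothetical 4x4 grid: 4 links per node.
print(torus_neighbors((0, 0), (4, 4)))
# 3D torus, hypothetical 4x4x4 grid: 6 links per node.
print(torus_neighbors((0, 0, 0), (4, 4, 4)))
# 3-bit hypercube (8 nodes): 3 links per node.
print(hypercube_neighbors(0, 3))
```

The practical difference: links per node stay fixed in a torus (4 in 2D, 6 in 3D) no matter how big the machine gets, while a hypercube needs log2(N) links per node.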