by sillysaurusx
2171 days ago
We forward a cluster of 2,560 TPU pod cores from our GCE project to other GCE projects in europe-west4-a. Originally we did this because we had a separate GCE project with a bunch of credits, but that project had no access to TPUs. The question was: could we still take advantage of the credits? It turns out we could. The solution involved VPC Network Peering, which I later learned is how the TPUs themselves work. Some configuration details are here: https://www.shawwn.com/swarm#iptables

Nowadays we forward the TPU pods to pretty much anyone who wants to try them out, in hopes of getting more people involved in the TPU programming scene. The TPUs are managed via a website (https://www.tensorfork.com/tpus), and we coordinate TPU access via a spreadsheet. Each researcher has their own GCE project, and we simply flip a switch to give them access.

If anyone reading this happens to be into ML and into programming for big hardware rigs, feel free to hop into the Tensorfork Discord server and we can show you the ropes. https://github.com/shawwn/tpunicorn#ml-community