by zwaps
1496 days ago
I struggle to understand precisely what you mean by user experience and ‘often improved performance’. Previously, PyTorch had no real support for crucial TPU features related to data loading. As a result, using a TPU instead of a GPU with PyTorch was frequently not worth it because of that exact issue. Your answer suggests this might be different now: are TF, JAX, and PyTorch now on par at all stages?
With TPU VMs, none of this is necessary. You can SSH directly into each TPU host machine and install arbitrary software on a VM there to handle data loading and other tasks with much greater flexibility.
The blog post provides an example of training cost improvement using PyTorch / XLA on TPU VMs in the "Local execution of input pipeline" section. Hopefully we will be able to provide more tutorials on using PyTorch / XLA with TPU VMs soon.
With TPU VMs, workloads that require lots of CPU-TPU communication can now do that communication locally instead of going over the network, which can improve performance.