by sorenbouma
2111 days ago
I'm a TPU user and I'd be interested to see a specific example of something that can be done on TPU but not GPU. Perhaps I'm just not experienced enough with the programming model, but I've found them to be strictly less flexible/more tricky than GPUs, especially for things like conditional execution, multiple graphs, variable size inputs and custom ops.

The central reason that TPUs feel less flexible is Google's awful mistake in encouraging everyone to use TPUEstimator as the One True API For Doing TPU Programming. Getting off that API was the single biggest boost to my TPU skills.
You can see an example of how to do that here: https://github.com/shawwn/ml-notes/blob/master/train_runner.... This is a repo that can train GPT-2 1.5B at 10 examples/sec on a TPUv3-8 (i.e. roughly 10k tokens/sec).
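As a rough sanity check on that examples-to-tokens conversion, here is a minimal sketch. It assumes each training example is one full 1,024-token GPT-2 context window; the sequence length isn't stated above, so treat that number and the helper name `tokens_per_second` as illustrative assumptions.

```python
# Assumption: one example == one full GPT-2 context window (1,024 tokens).
# The sequence length is not given in the comment; adjust as needed.
TOKENS_PER_EXAMPLE = 1024


def tokens_per_second(examples_per_sec, tokens_per_example=TOKENS_PER_EXAMPLE):
    """Convert training throughput from examples/sec to tokens/sec."""
    return examples_per_sec * tokens_per_example


rate = tokens_per_second(10)
print(f"{rate:,.0f} tokens/sec")  # 10,240 tokens/sec
```

Under that assumption, 10 examples/sec works out to 10,240 tokens/sec, which matches the "around 10k" figure.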
Happy to answer any specific questions or peek at codebases you're hoping to run on TPUs.