Hacker News new | ask | show | jobs
by slashcom 3033 days ago
Wait but, the batch size is 8x bigger for the TPU? That's not a fair comparison; increasing batch size always speeds things up...
2 comments

Author here.

Note that the TPU supports larger batch sizes because it has more RAM. We tested multiple batch sizes for GPUs and reported the fastest one. We'll try increasing the batch sizes as far as possible and report. The overall comparison will likely not change by much - we saw speed increases of around 5% doubling the batch size from 64 to 128. (https://www.tensorflow.org/performance/benchmarks also reports numbers for batch sizes of 32 and 64 on the P100)

Disclosure: I work on Google Cloud.

Oh! You should definitely say that. It's semi-reasonable then to choose the batch size that is optimal for the part. It'd be good to make sure this isn't why your LSTM didn't converge though...

I tested many different batch sizes for the LSTM, so I am pretty confident it's not the reason.
But typically does not have an impact on power usage.

They are claiming a 29x improvement in that area.