	by elmarhaussmann 3080 days ago
	Author here. Note that the TPU supports larger batch sizes because it has more RAM. We tested multiple batch sizes for GPUs and reported the fastest one. We'll try increasing the batch sizes as far as possible and report back. The overall comparison will likely not change much: we saw speed increases of around 5% when doubling the batch size from 64 to 128. (https://www.tensorflow.org/performance/benchmarks also reports numbers for batch sizes of 32 and 64 on the P100.)
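
For readers who want to try the "test several batch sizes, report the fastest" procedure described above, here is a minimal sketch. It is not the benchmark code used in the comparison. `run_step` is a hypothetical callable that executes one training step at a given batch size, and the function names, step counts, and warm-up count are all assumptions chosen for illustration.

```python
import time


def examples_per_second(run_step, batch_size, steps=20, warmup=3,
                        clock=time.perf_counter):
    """Measure throughput (examples/sec) for one batch size.

    run_step is a hypothetical callable: run_step(batch_size) runs one
    training step. Warm-up steps are executed but not timed.
    """
    for _ in range(warmup):
        run_step(batch_size)
    start = clock()
    for _ in range(steps):
        run_step(batch_size)
    elapsed = clock() - start
    return batch_size * steps / elapsed


def fastest_batch_size(run_step, batch_sizes, **kwargs):
    """Time every candidate batch size and return (best, all_results)."""
    results = {
        bs: examples_per_second(run_step, bs, **kwargs)
        for bs in batch_sizes
    }
    best = max(results, key=results.get)
    return best, results
```

Calling `fastest_batch_size(run_step, [32, 64, 128])` returns the batch size with the highest measured throughput along with the per-size numbers, so a gain like the roughly 5% mentioned above can be read off as `results[128] / results[64]`. Throughput alone says nothing about convergence, which would need to be checked separately.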

boulos 3080 days ago

Disclosure: I work on Google Cloud.

Oh! You should definitely say that. It's semi-reasonable then to choose the batch size that is optimal for the part. It'd be good to make sure this isn't why your LSTM didn't converge though...

elmarhaussmann 3080 days ago

I tested many different batch sizes for the LSTM, so I am pretty confident it's not the reason.
