by sdenton4
2976 days ago
"As shown above, the top-1 accuracy after 90 epochs for the TPU implementation is 0.7% better. This may seem minor, but making improvements at this already very high level is extremely difficult and, depending on the application, such small improvements may make a big difference in the end."

Any idea of how much variation in accuracy you get on different training runs of the same model on the same hardware? My understanding is that model quality can and does vary from one run to the next on these kinds of large datasets, so from a single observation it's hard to know if the difference is real or noise.
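
To make the question concrete, here is a rough sketch of the kind of check that repeated runs would allow. It assumes you have top-1 accuracies from several independent training runs per setup; the function name `gap_vs_noise` and the numbers in the usage section are hypothetical placeholders, not measurements from the article.

```python
import math
import statistics


def gap_vs_noise(baseline_runs, candidate_runs):
    """Compare the mean accuracy gap between two setups to run-to-run noise.

    Each argument is a list of top-1 accuracies from independent training
    runs of the same model (at least two runs per setup).
    Returns (mean_gap, standard_error, ratio).
    """
    mean_gap = statistics.mean(candidate_runs) - statistics.mean(baseline_runs)
    # Standard error of the difference between two independent sample means.
    standard_error = math.sqrt(
        statistics.variance(baseline_runs) / len(baseline_runs)
        + statistics.variance(candidate_runs) / len(candidate_runs)
    )
    # A ratio well above ~2 suggests the gap is larger than the noise.
    ratio = mean_gap / standard_error if standard_error > 0 else float("inf")
    return mean_gap, standard_error, ratio


if __name__ == "__main__":
    # Hypothetical placeholder accuracies, made up for illustration only.
    baseline = [0.760, 0.757, 0.763, 0.759, 0.761]
    candidate = [0.767, 0.765, 0.769, 0.764, 0.770]
    gap, se, ratio = gap_vs_noise(baseline, candidate)
    print(f"gap={gap:.4f} se={se:.4f} ratio={ratio:.1f}")
```

With only one run per setup, the variance terms cannot be estimated at all, which is the core of the concern: a single 0.7% gap has no error bar attached.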