by ezoe
1482 days ago
CIFAR-10 has 10,000 test images, so 0.03% of the test set is 3 images. At a number that small, randomness starts to affect the scores: human labeling mistakes in the test data, for example. Training the SotA model with a different random seed might make its score 0.03% better or worse. Hell, 17,810 TPU core-hours is a huge number. You can't ignore the role of randomness. What if a cosmic ray hit a specific memory cell and caused a soft memory error, producing a single wrong calculation that ultimately shifted the final trained model by 0.03%? So it's more like: "Jeff Dean spent enough money to feed a family of four for half a decade to get a 0.03% lottery win on CIFAR-10."
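To put rough numbers on this, here is a minimal back-of-the-envelope sketch. The 99% accuracy figure is a placeholder I picked for illustration, not a number from the paper, and the binomial model assumes each test image is an independent draw, which is only an approximation. It covers sampling noise from the finite test set only, not seed-to-seed variance, which would come on top.

```python
import math

def images_for_delta(n_test: int, delta_pct: float) -> int:
    """Number of test images that a given accuracy difference (in percent) corresponds to."""
    return round(n_test * delta_pct / 100.0)

def accuracy_std_error(accuracy: float, n_test: int) -> float:
    """Binomial standard error of a measured accuracy on n_test independent samples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_test)

N_TEST = 10_000          # CIFAR-10 test set size
DELTA_PCT = 0.03         # the improvement under discussion, in percent
ASSUMED_ACCURACY = 0.99  # hypothetical accuracy, chosen only for illustration

flipped = images_for_delta(N_TEST, DELTA_PCT)
se_pct = 100.0 * accuracy_std_error(ASSUMED_ACCURACY, N_TEST)

print(f"{DELTA_PCT}% of {N_TEST} images = {flipped} images")
print(f"standard error at {ASSUMED_ACCURACY:.0%} accuracy = {se_pct:.3f} percentage points")
```

Under those assumptions the standard error comes out around 0.1 percentage points, so a 0.03% gap is well inside one standard error of the measurement.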
In fact, in a recent big paper from Google, they mentioned that training occasionally went wonky in completely nonreproducible ways, but I am pretty sure I know what happened.