by antognini
2978 days ago
I've been running a lot of these ResNet-50 experiments lately, and the run-to-run variation is very small, on the order of 0.1%. It's actually pretty amazing how consistent training is, given that the initialization is always different and the data is sampled differently on each run. (As an aside, it took us about three weeks to track down a bug that was causing the model to consistently reach an accuracy 1% lower than it was supposed to.)
Tracking down bugs in convergence is really costly in these settings. We had a problem in pre-processing that took us quite a while to figure out...