by m_ke
1712 days ago
1. The small subset is there to test that your training pipeline works and converges to near-zero loss.
2. Sure, but for most new tricks like mixup, RandAugment, etc., the results usually transfer over. The problem with deep learning is that most new results don't replicate, so it's good to have a way to validate things quickly.
3. The lower-level features are usually fairly data-agnostic and transfer well to new tasks.
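To make point 1 concrete, here is a rough sketch of the "overfit a tiny subset" sanity check. It is only an illustration under stated assumptions: a plain linear model trained with full-batch gradient descent stands in for a real network, the data is random, and the names `mse` and `overfit_tiny_subset` are made up for this example. The idea is that a model with more parameters than examples should be able to memorize a handful of samples, so if the loss does not drop to roughly zero, the pipeline (data loading, loss, gradient update) probably has a bug.

```python
import random


def mse(weights, xs, ys):
    """Mean squared error of a linear model on the given examples."""
    total = 0.0
    for x, y in zip(xs, ys):
        pred = sum(w * xi for w, xi in zip(weights, x))
        total += (pred - y) ** 2
    return total / len(xs)


def overfit_tiny_subset(n_examples=4, n_features=16, steps=3000, lr=0.02, seed=0):
    """Train on a handful of examples and return (initial_loss, final_loss).

    Hypothetical stand-in for a real pipeline: random features, random
    targets, and more parameters than examples so memorization is possible.
    """
    rng = random.Random(seed)
    xs = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_examples)]
    ys = [rng.uniform(-1, 1) for _ in range(n_examples)]

    weights = [0.0] * n_features
    initial_loss = mse(weights, xs, ys)

    for _ in range(steps):
        # Full-batch gradient of the MSE with respect to each weight.
        grads = [0.0] * n_features
        for x, y in zip(xs, ys):
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            for j, xi in enumerate(x):
                grads[j] += 2.0 * err * xi / n_examples
        weights = [w - lr * g for w, g in zip(weights, grads)]

    return initial_loss, mse(weights, xs, ys)


initial_loss, final_loss = overfit_tiny_subset()
print(f"initial loss: {initial_loss:.4f}, final loss: {final_loss:.2e}")
```

With a real model the same check applies: take a few batches, turn off augmentation and regularization, and confirm the training loss goes to roughly zero before spending compute on a full run.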