| I wrote a long practical guide on image augmentation based on ~10 years of training computer vision models and ~7 years maintaining Albumentations.

Despite augmentation being used everywhere, most discussions are still very surface-level (“flip, rotate, color jitter”). In this article I tried to go deeper and explain:

• The *two regimes of augmentation*:
  – in-distribution augmentation (simulate real variation)
  – out-of-distribution augmentation (regularization)
• Why *unrealistic augmentations can actually improve generalization*
• How augmentation relates to the *manifold hypothesis*
• When and why *Test-Time Augmentation (TTA)* helps
• Common *failure modes* (label corruption, over-augmentation)
• How to design a *baseline augmentation policy that actually works*

The guide is long but very practical: it includes concrete pipelines, examples, and debugging strategies.

Would love feedback from people working on real CV systems.

Link:
https://medium.com/data-science-collective/what-is-image-aug... |
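
For anyone unfamiliar with the TTA item in the list above, here is a minimal sketch of the general idea: run the model on several views of the same image and average the scores. It is not taken from the guide. The `toy_model`, the `Image` type, and the helper names are hypothetical stand-ins, and it assumes a task where a horizontal flip does not change the label.

```python
from typing import Callable, List, Sequence

# Toy grayscale image: a list of rows, pixel values in [0, 1] (assumption for illustration).
Image = List[List[float]]


def identity(image: Image) -> Image:
    """Return the image unchanged (the 'original view')."""
    return image


def hflip(image: Image) -> Image:
    """Mirror the image left-to-right."""
    return [row[::-1] for row in image]


def predict_with_tta(
    model: Callable[[Image], Sequence[float]],
    image: Image,
    transforms: Sequence[Callable[[Image], Image]] = (identity, hflip),
) -> List[float]:
    """Average the model's class scores over several views of one image."""
    all_scores = [model(t(image)) for t in transforms]
    n_views = len(all_scores)
    # zip(*...) groups the scores per class across all views.
    return [sum(per_class) / n_views for per_class in zip(*all_scores)]


def toy_model(image: Image) -> List[float]:
    """Hypothetical 'bright vs dark' classifier with a left/right bias.

    It weights columns further to the right more heavily, so its output
    changes when the image is mirrored even though the label should not.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row in image:
        for col_index, pixel in enumerate(row):
            weight = col_index + 1
            weighted_sum += weight * pixel
            weight_total += weight
    p_bright = weighted_sum / weight_total
    return [p_bright, 1.0 - p_bright]


if __name__ == "__main__":
    image = [[1.0, 0.0], [1.0, 0.0]]  # bright left half, dark right half
    print("single view:", toy_model(image))
    print("with TTA:   ", predict_with_tta(toy_model, image))
```

In this toy case the single-view score depends on which side the bright pixels are on, while the averaged score does not. With a real model the views would be batched tensors, and the set of transforms should be limited to ones that preserve the label.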