by brody_hamer 523 days ago
They did.

> “To address this limitation, we turned to data augmentation, artificially creating new versions of each image by modifying colors, adding noise, applying distortion, or rotating images. By the end, we had generated 600 augmented images per car.”


Those are pretty standard. A standard YOLO training run applies more transformations than that, and there are ready-made modules that do the same in Keras and PyTorch (for their MobileNet and VGG16 models). I'm not sure anyone is training a serious vision model without that kind of data augmentation.
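For anyone who hasn't seen it, here is a rough sketch of what that kind of augmentation boils down to. It is pure Python on a grayscale image stored as a list of rows, and it only covers rotation, a brightness shift, and noise. The function names, the parameter ranges, and the 90-degree rotation steps are my own assumptions for illustration, not what the article's authors or any particular library actually do.

```python
import random

def augment(image, rng):
    """Return one randomly augmented copy of a grayscale image.

    `image` is a list of rows of ints in 0..255 (an assumed toy format).
    """
    out = [row[:] for row in image]
    # Rotate by a random multiple of 90 degrees (assumption: real pipelines
    # usually also use arbitrary angles, which needs interpolation).
    for _ in range(rng.randrange(4)):
        out = [list(row) for row in zip(*out[::-1])]  # 90 degrees clockwise
    # One brightness shift for the whole image, plus per-pixel Gaussian noise.
    shift = rng.randint(-30, 30)
    return [
        [min(255, max(0, round(p + shift + rng.gauss(0, 5)))) for p in row]
        for row in out
    ]

def augment_many(image, count=600, seed=0):
    """Generate `count` augmented variants of a single source image."""
    rng = random.Random(seed)
    return [augment(image, rng) for _ in range(count)]
```

Calling `augment_many(image)` on one source image gives 600 variants, which matches the count quoted above. In practice you would use the augmentation pipeline that ships with your training framework rather than hand-rolling this.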

What I am talking about is that they want to recognize scenes containing the images, but only have the images themselves as training data. They have a good idea of what those scenes will look like. Going there to take actual training pictures was evidently not viable, but generating approximations of them might have been.
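One cheap way to approximate such scenes is to paste the target image onto background pictures at a random size and position, and keep the paste location as the label. The sketch below is only my guess at what "generating approximations" could look like. The function names, the scale range, and the list-of-rows image format are all hypothetical, and a serious version would add perspective warps, lighting changes, and blending at the edges.

```python
import random

def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize of a list-of-rows image."""
    h, w = len(image), len(image[0])
    return [
        [image[r * h // new_h][c * w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]

def composite(background, target, rng, min_scale=0.2, max_scale=0.5):
    """Paste `target` into a copy of `background` at a random place and size.

    Returns the new scene and the bounding box (left, top, width, height),
    which can serve as the detection label for the pasted target.
    """
    bg_h, bg_w = len(background), len(background[0])
    # Target height as a random fraction of the background height
    # (assumed range), with the width following the target's aspect ratio.
    scale = rng.uniform(min_scale, max_scale)
    t_h = max(1, min(bg_h, int(bg_h * scale)))
    t_w = max(1, min(bg_w, round(t_h * len(target[0]) / len(target))))
    patch = resize_nearest(target, t_h, t_w)
    top = rng.randint(0, bg_h - t_h)
    left = rng.randint(0, bg_w - t_w)
    scene = [row[:] for row in background]
    for r in range(t_h):
        scene[top + r][left:left + t_w] = patch[r]
    return scene, (left, top, t_w, t_h)
```

Running `composite` over many background photos, each with a fresh random draw, yields labelled scene images without anyone having to go and photograph the real locations.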