by bpires
3725 days ago
I don't think YOLO [0], the object detector he talked about, requires the massive amount of data he claimed. Yes, if you want to learn to classify 1000 different categories like on ImageNet, you need a lot of data. But if you start from a pretrained network like YOLO (it was pretrained on ImageNet and trained on Pascal), you don't need a lot of images. I've retrained it on the KITTI dataset [1] and had no issues at all, and that's only 7k images. By the way, KITTI has a vehicles dataset that might be helpful for your case.

Also, you don't even need to retrain YOLO on your vehicle dataset. It was trained on Pascal VOC [2], a dataset of 20 categories, one of which is car. So YOLO already knows how to detect cars. It might not be ideal for your dataset, but that doesn't matter since you just want a baseline to compare against. This would probably have been even less work than training the cascade classifier you used, and would have achieved better results.

[0]: http://pjreddie.com/darknet/yolo/
[1]: http://www.cvlibs.net/datasets/kitti/
[2]: http://host.robots.ox.ac.uk/pascal/VOC/
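To make the "no retraining" baseline concrete, here is a rough sketch of the only post-processing it would need: keep the detections labeled "car" and drop everything else. The `Detection` type, the `keep_cars` helper, the 0.5 threshold, and the sample numbers are all made up for illustration; this is not YOLO's or Darknet's actual output format, so you would have to adapt the parsing to whatever your detector emits.

```python
from typing import List, NamedTuple, Tuple

# The 20 Pascal VOC object categories; "car" is one of them.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


class Detection(NamedTuple):
    """Hypothetical container for one detection (an assumption, not a real API)."""
    label: str
    confidence: float
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def keep_cars(detections: List[Detection], min_confidence: float = 0.5) -> List[Detection]:
    """Return only the 'car' detections at or above min_confidence."""
    return [
        d for d in detections
        if d.label == "car" and d.confidence >= min_confidence
    ]


# Made-up detector output for a single image, purely for illustration.
example = [
    Detection("car", 0.91, (10.0, 20.0, 110.0, 80.0)),
    Detection("person", 0.88, (120.0, 15.0, 150.0, 90.0)),
    Detection("car", 0.30, (200.0, 40.0, 260.0, 75.0)),
]

cars = keep_cars(example)
for d in cars:
    print(d.label, d.confidence, d.box)
```

The confidence threshold is the one knob worth sweeping when you compare against the cascade classifier, since it trades missed cars against false positives.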
|
I feel like all those deep learning papers distorted people's perception of scale. If you need to take those 7k images by hand because your application domain is obscure and they aren't available in an existing dataset, that's way beyond feasible.