| This article is dead-on, but I think it is missing a fairly large segment of where ML is actually working well: anomaly detection and industrial defect detection. While I agree that everyone was shocked, myself included, when we saw how well SSD and YOLO worked, the last mile problem is stagnating. What I mean is: 7 years ago I wrote an image pipeline for a company using traditional AI methods. It was extremely challenging. When we saw SSDMobileNet do the same job 10x faster with a fraction of the code, our jaws dropped. Which is why the dev ship turned on a dime: there's something big in there. The industry has stagnated for exactly the reasons brought up: we don't know how to squeeze out the last mile problem because NNs are EFFING HARD and research is very math heavy, i.e., it cannot be hacked by a Zuck-type into a half-assed product overnight, it needs to be carefully researched for years. This makes programmers sad, because by nature we love to brute force trial-and-error our code, and homey don't play that game with machine learning. However, places where it isn't stagnating are things like vibration and anomaly detection. This is a case where https://github.com/YumaKoizumi/ToyADMOS-dataset really shines because it adds something that didn't exist before, and it doesn't have to be 100% perfect: anything is better than nothing. At Embedded World last year I saw tons of FPGA solutions for rejecting parts on assembly lines. Since every object appears nearly in canonical form (good lighting, centered, homogeneous presentation), NNs are kicking ass bigtime in that space. It is important to remember Self-Driving Car Magic is just the consumer-facing hype machine. ML/NNs are working spectacularly well in some domains. |
However, you can make significant gains in your models by going back to traditional image filtering/augmentation. Sticking with well-researched object detectors/segmentation algorithms and putting your effort into improving the algorithms that clean up the data takes you far. That cleanup work is impossible to avoid, because images will always be full of reflections, artifacts, and strange coloration unless you have the perfect lighting tunnel setup, but it's doable nonetheless.
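To make the "clean up the data first" idea concrete, here is a rough sketch of two classic filters you might run before handing frames to an off-the-shelf detector: a gray-world white balance for strange coloration and a 3x3 median filter for speckle artifacts. This is only an illustration under my own assumptions, not anyone's production pipeline: the function names (`gray_world_balance`, `median_filter3`, `preprocess`) are made up, images are plain nested lists of `(r, g, b)` tuples so it runs with the standard library alone, and in practice you would reach for NumPy or OpenCV instead.

```python
from statistics import mean, median


def gray_world_balance(image):
    """Scale each channel so its mean matches the overall mean (gray-world assumption).

    `image` is a list of rows; each row is a list of (r, g, b) tuples in 0..255.
    """
    pixels = [px for row in image for px in row]
    channel_means = [mean(px[c] for px in pixels) for c in range(3)]
    gray = mean(channel_means)
    # Guard against an all-zero channel to avoid division by zero.
    gains = [gray / m if m > 0 else 1.0 for m in channel_means]
    return [
        [tuple(min(255, round(px[c] * gains[c])) for c in range(3)) for px in row]
        for row in image
    ]


def median_filter3(image):
    """Per-channel 3x3 median filter; border pixels reuse the nearest valid pixel."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            row.append(tuple(int(median(px[c] for px in window)) for c in range(3)))
        out.append(row)
    return out


def preprocess(image):
    """Hypothetical cleanup step: remove speckle first, then correct the color cast."""
    return gray_world_balance(median_filter3(image))
```

Running the median filter before the white balance is a judgment call: it keeps a single blown-out pixel from skewing the channel means. Real reflections usually cover more than one pixel, so treat this as a starting point, not a fix.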