by wiricon
812 days ago
How well does simulated data work in this space? My first stab at doing this scalably would be as follows. Given a new product, physically obtain a single instance of it (or ideally a 3D model, but that seems like a big ask from manufacturers at this stage). Capture images of it from many angles under a variety of lighting conditions; this seems automatable with a robotic arm to rotate the object and some kind of lighting rig. Get an instance mask for each image (from a human annotator, a 3D reconstruction method, or a FG-BG segmentation model). Paste those instances onto random background images (e.g. from any large image dataset), add distractor objects and other augmentations, and finally train a model on the resulting dataset. It helps that many grocery items are relatively rigid (boxes, bottles, etc.) and always look the same, but I guess that's also the limit of the approach: you'd need a lot more variety for things like fruit and veg, which are non-rigid and vary a lot in appearance, and you'd need to account for changing packaging as well.
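To make the "paste instances on random backgrounds" step concrete, here is a rough sketch of what I have in mind, using only NumPy. The function names (`paste_instance`, `random_sample`) and the array layout are my own assumptions for illustration, not from any particular library, and a real pipeline would add scaling, blending at the mask edges, distractors and so on.

```python
import numpy as np

def paste_instance(background, instance, mask, top, left):
    """Composite a masked product crop onto a background image.

    background: (H, W, 3) uint8 image
    instance:   (h, w, 3) uint8 crop containing the product
    mask:       (h, w) bool array, True where the product is
    Returns (composite, full_size_mask, (x0, y0, x1, y1)).
    """
    if not mask.any():
        raise ValueError("mask is empty")
    h, w = mask.shape
    H, W = background.shape[:2]
    if top < 0 or left < 0 or top + h > H or left + w > W:
        raise ValueError("instance must fit inside the background")

    out = background.copy()
    region = out[top:top + h, left:left + w]  # view into `out`
    region[mask] = instance[mask]             # copy only product pixels

    # Full-size instance mask and tight bounding box double as labels.
    full_mask = np.zeros((H, W), dtype=bool)
    full_mask[top:top + h, left:left + w] = mask
    ys, xs = np.nonzero(full_mask)
    box = (int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1)
    return out, full_mask, box

def random_sample(background, instance, mask, rng):
    """Paste the instance at a uniformly random position that fits."""
    h, w = mask.shape
    H, W = background.shape[:2]
    top = int(rng.integers(0, H - h + 1))
    left = int(rng.integers(0, W - w + 1))
    return paste_instance(background, instance, mask, top, left)
```

Calling `random_sample(bg, crop, crop_mask, np.random.default_rng(0))` in a loop over backgrounds and captured views would give you image / mask / box triples to train on. Whether a model trained on those transfers to real shelves is exactly the open question.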
simulated data also becomes a cost problem. we have to produce a digital twin that is realistic (at least as far as the model is concerned) and doesn't clash too much with the real data, and measuring that gap matters when the task is telling Lay's apart from Lay's Low Sodium.
i'm not saying it's unsolvable. it's just a difficult problem.