by Lewton 1029 days ago
they're testing 8 different detection algorithms

> The detection systems were 19.67% more likely to detect adults than children, and 7.52% more likely to detect people with lighter skin tones than people with darker skin tones, according to the study.

while they all had a harder time with children than with adults, that 7.52% comes from averaging 2 algorithms that performed abysmally with 6 that had no statistically significant differences

https://arxiv.org/pdf/2308.02935.pdf table 6
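
To illustrate how an average across detectors can be driven by a couple of outliers, here is a toy sketch. The numbers are made up for illustration and are NOT taken from the paper's table 6; the variable name `gaps` is also just an assumption.

```python
from statistics import mean, median

# HYPOTHETICAL gaps in miss rate between two groups, in percentage points,
# for 8 detectors. These values are invented for illustration only and do
# not come from the paper.
gaps = [0.5, -0.3, 0.2, 0.1, -0.4, 0.3, 12.0, 15.0]

# The mean is pulled up by the two large values; the median is not.
print("mean gap:  ", mean(gaps))
print("median gap:", median(gaps))
```

With numbers shaped like this, the mean suggests a sizable gap across the board while the median shows that most detectors sit close to zero.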

3 comments

The two with significantly worse performance were RetinaNet and YOLOX. I don't really know anything about the field, but it's interesting they're both single stage performant models, while the slower but lower miss-rate RCNN variants are two-stage. It's interesting that the pedestrian-specific models are all worse than the general models at detecting people!

The conclusion is kind of weird: apparently their "findings reveal significant bias in the current pedestrian detectors" despite the bias being almost entirely within the single-pass general object detectors. And where it's statistically significant in the other models, the miss rate is low in both cases, and the effect is reversed! (Dry-weather Cascade-RCNN does better on dark-skin than light-skin, among others.)

I think you misunderstood table 6. All algorithms show significant differences in miss rate for children, two show significant differences based on gender, and four others based on skin color. The four that showed no statistically significant difference between light and dark skin had very high miss rates overall. Of the other four, two are much worse for dark skin, and two are slightly better. Those last two are also best at detecting children, but 28% miss rate is still a bit too high for my taste.

Yeah, I missed the two statistically significant algorithms that favor darker skin, since they have smaller percentage differences than the ones that weren't marked as statistically significant (but I guess that's because of how the difference relates to the overall miss rate).

RE: 28% miss rate, I think this is meaningless, as it's measured on single images/data points, while self-driving cars get a continuous stream of data.
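
A rough back-of-the-envelope sketch of that point: if every frame were an independent chance to detect the pedestrian, a per-frame miss rate p would give a probability of p^n of missing them in all n frames. Independence is a strong assumption here (consecutive video frames are highly correlated, so the real number would be worse), and the function name and frame counts are just illustrative.

```python
def miss_all_frames(per_frame_miss_rate: float, n_frames: int) -> float:
    """Probability of missing a pedestrian in every one of n_frames.

    ASSUMPTION: each frame is an independent detection attempt. Real video
    frames are correlated, so treat this as an optimistic lower bound.
    """
    if not 0.0 <= per_frame_miss_rate <= 1.0:
        raise ValueError("per_frame_miss_rate must be between 0 and 1")
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    return per_frame_miss_rate ** n_frames


# 0.28 is the per-image miss rate mentioned above; the frame counts are arbitrary.
for n in (1, 5, 10):
    print(f"{n:2d} frames: {miss_all_frames(0.28, n):.6f}")
```

Under that assumption the chance of missing someone in all of 5 frames is already well under 1%, which is why a single-image miss rate says little on its own about a system that sees a stream.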

Are these pedestrian detection models in use in any widely-deployed commercial self-driving car? Is there a limitation since these are images rather than videos? I would've expected these to be addressed in the "Threats to Validity." There is also no control comparison to humans, beyond the two annotators. Are these detectors significantly worse than humans?

There is no telling whether these results are valid or applicable at all, but they purport that there is statistically significant unfairness based on gender and skin color. At best, this feels misleading.