by tedivm 1494 days ago
> If I had to bet, and I knew where the data was coming from, I'd say it's probably picking up on the style of imaging, rather than anything anatomical. Not all x-rays have bones in, and not all bones differ reliably to detect race.

This was my guess as well. I've spent a lot of time around radiology and AI (I used to work at a company specializing in it), and we read about a lot of the failure cases. There was one example where the model picked up on which hospital each image came from, and one hospital was for higher risk patients- so it learned to assign all patients from that hospital to the disease category simply because they were at that hospital.

There are a ton of cases like this out there, especially when using public datasets (which in the medical field tend to be very unbalanced because it is so difficult to build a HIPAA-compliant public dataset).
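To make the hospital example concrete, here is a rough sketch of a "site-only baseline": how accurate you would be if you ignored the image entirely and just predicted each hospital's majority label. The hospital names, the counts, and the `site_only_baseline` helper are all made up for illustration; they are not from the case I described.

```python
from collections import Counter, defaultdict


def site_only_baseline(records):
    """Accuracy of always predicting each site's majority label.

    records: list of (site, label) pairs. The image is never looked at.
    """
    by_site = defaultdict(Counter)
    for site, label in records:
        by_site[site][label] += 1
    # For each site, the majority-label count is the number we get right.
    correct = sum(counts.most_common(1)[0][1] for counts in by_site.values())
    return correct / len(records)


# Hypothetical toy data: hospital_a sees mostly high-risk patients.
records = (
    [("hospital_a", 1)] * 90 + [("hospital_a", 0)] * 10
    + [("hospital_b", 1)] * 5 + [("hospital_b", 0)] * 95
)

baseline = site_only_baseline(records)
print(f"site-only baseline accuracy: {baseline:.3f}")
```

If a trained model does not clearly beat this number, it may have learned little beyond "which hospital is this from", which is about the cheapest pattern available in an unbalanced dataset.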

1 comments

> one hospital was for higher risk patients- so it learned to assign all patients from that hospital to the disease category simply because they were at that hospital.

That just sounds like poor feature selection/engineering. Garbage in, garbage out.

Yeah, there are definitely ways they could have avoided that, but it's just one example of many. The whole point of ML is that it picks up on patterns in the data. The problem is that it can be difficult to identify which patterns it is learning from: this paper itself says they do not know what is causing the model to make these predictions. As a result, it is difficult to validate that the model is doing what people think it is.
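One partial check for this kind of shortcut, sketched below under made-up assumptions, is to score the model inside each hospital separately using balanced accuracy (the mean of per-class recall). A model that only learned the hospital can look good overall but drops to chance within every site. The `shortcut` "model", the data, and the `balanced_accuracy_by_site` helper are hypothetical, and passing this check does not prove the model is free of other confounders.

```python
from collections import defaultdict


def balanced_accuracy_by_site(rows):
    """Mean per-class recall, computed separately for each site.

    rows: iterable of (site, true_label, predicted_label).
    """
    # counts[site][true_label] = [n_correct, n_total]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for site, y_true, y_pred in rows:
        cell = counts[site][y_true]
        cell[0] += int(y_pred == y_true)
        cell[1] += 1
    return {
        site: sum(c / n for c, n in per_class.values()) / len(per_class)
        for site, per_class in counts.items()
    }


# Hypothetical toy data: (hospital, has_disease).
records = (
    [("hospital_a", 1)] * 90 + [("hospital_a", 0)] * 10
    + [("hospital_b", 1)] * 5 + [("hospital_b", 0)] * 95
)

# A stand-in "model" that only looks at the hospital, never the image.
shortcut = {"hospital_a": 1, "hospital_b": 0}
rows = [(site, y, shortcut[site]) for site, y in records]

overall = sum(int(y == p) for _, y, p in rows) / len(rows)
per_site = balanced_accuracy_by_site(rows)
print(f"overall accuracy: {overall:.3f}")
print(f"balanced accuracy per site: {per_site}")
```

On this toy data the overall accuracy is high while the within-site balanced accuracy is 0.5 for both hospitals, which is what you would expect from a model that is not looking at anything patient-specific.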