by tech-historian 1493 days ago
The interpretation part hit home: "The results from our study emphasise that the ability of AI deep learning models to predict self-reported race is itself not the issue of importance. However, our finding that AI can accurately predict self-reported race, even from corrupted, cropped, and noised medical images, often when clinical experts cannot, creates an enormous risk for all model deployments in medical imaging."
3 comments

"Predict self-reported race". Not race from DNA. (That's routinely available from 23andMe, and is considered an objective measurement.[1]) They should have collected both. Now they don't know what they've measured.

[1] https://www.nytimes.com/2021/02/16/opinion/23andme-ancestry-...

what's this enormous risk they're talking about? racial bias in x-ray reading? race can be a risk factor in plenty of diseases, why should we actively try to remove this information from medical images?
"This issue creates an enormous risk for all model deployments in medical imaging: if an AI model relies on its ability to detect racial identity to make medical decisions, but in doing so produced race-specific errors, clinical radiologists (who do not typically have access to racial demographic information) would not be able to tell, thereby possibly leading to errors in health-care decision processes."
Without knowing the actual outcome, isn’t there also a possibility of error due to not knowing the race of the individual? They used mammogram images in the study and it is well known that incidence of breast cancer varies by race. Removing that information from the model could result in worse performance.
Well one thing you wouldn’t want to do is take the output of this model and then apply a correction factor for race on top of it, because the model is already taking that into account.
Is that true or would it help as a tie breaker in cases where the confidence was just at or below the threshold?
Well I suppose you only care about a correction factor to a binary model when it breaks a tie. You wouldn't want to apply a tiebreaker correction twice though.
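To make the double-counting point concrete, here is a minimal sketch in log-odds terms. Everything in it is an assumption for illustration: the 0.35 base score, the 0.5 threshold, the likelihood ratio of 1.5, and the idea that the "correction factor" acts multiplicatively on the odds. None of it comes from the paper.

```python
import math

def to_logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

def to_prob(logit):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-logit))

def apply_correction(p, likelihood_ratio):
    """Multiply the odds by a likelihood ratio (add its log in logit space)."""
    return to_prob(to_logit(p) + math.log(likelihood_ratio))

THRESHOLD = 0.5          # hypothetical decision threshold
base_score = 0.35        # hypothetical score with no group information at all
group_lr = 1.5           # hypothetical likelihood ratio for the patient's group

# A model that already picks up the group signal would output roughly this:
once = apply_correction(base_score, group_lr)    # about 0.447

# Applying the same correction again on top of that output double counts it:
twice = apply_correction(once, group_lr)         # about 0.548

print(f"corrected once:  {once:.3f} -> positive? {once >= THRESHOLD}")
print(f"corrected twice: {twice:.3f} -> positive? {twice >= THRESHOLD}")
```

With these made-up numbers, the correctly adjusted score stays below the threshold, while the doubly adjusted one crosses it. That is the tiebreaker-applied-twice case.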
Typically? It's coded in the standard. There's a DICOM tag for it.

https://dicom.innolitics.com/ciods/procedure-log/patient/001...

Unlike the authors of this research paper I am not a trained clinician, so I can't tell you. However I would note that the first exemplary value in the link you gave me is "REMOVED".
It doesn't provide example data, but there's still a spot in the standard for it. The values can differ by modality or manufacturer. Sure, it's not required, but certainly it's very important in some situations. Consider dermoscopy.

If interested, searching for "dicom conformance" should yield lots of docs that probably contain specific values for those things.

FWIW, the standard printed out is multiple linear feet of shelf space. There is a spot for a lot of things.

One common issue is a lot of these kinds of tags rely on optional human input and are inconsistently applied. As opposed to say, modality specific parameters produced by a machine, which are consistent.

DICOM is a great example of design by committee, with the +'ve and -'ves that implies.

I don't understand that part. All modern EHRs have a field for self-reported race, and clinical radiologists do typically have access to that information. (Whether they actually look at it, or whether it's useful when reading images, are separate issues.)
ok, maybe it's a US-specific thing, but why wouldn't a clinical radiologist have all the information he can gather about his patient, including race, to help the diagnosis?
Because in the US we are required to pretend that there is no such thing as race and no such thing as gender, and all people are exactly and precisely the same and there can be no differences.
Not to get into a flame war, but I want to present an alternate option to yours.

Because in the US some people have a hard time understanding that all races and genders deserve to be treated equally as humans with the same access to goods and services. Further, there are disparities in care based on race/ethnicity[1][2] and gender[3][4] because of the racism/sexism present in the systems. This sometimes leads to a requirement that race/ethnicity and gender data be scrubbed, to keep people from impacting outcomes based on their own biases.

[1] https://www.americanbar.org/groups/crsj/publications/human_r...

[2] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1924616/

[3] https://www.americashealthrankings.org/learn/reports/2019-se...

[4] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2965695/

It sometimes makes sense to scrub race/ethnicity/gender information from certain types of data, typically when a human is going to be making individual decisions.

For example, not having race data on resumes is generally productive, because that categorization can't provide a meaningful input to the decision associated with an individual person. Even if it were to be the case that there was some correlation between race and skill at whatever job you're interviewing for[1], the size of the effect is almost certainly small, and in the meanwhile you've also controlled for any bias in the person doing the reviewing.

If you're having a machine look at a dataset, and the machine determines that race or ethnicity is a material factor in determining some attribute in that dataset, you're not doing anybody any good by denying that fact and destroying the result.

[1] Let's ignore, for the purposes of this discussion, fields (like certain sports) where extreme competition combines with a position heavily dependent upon racially-linked physical characteristics. Though even in this case, there is still a (different, weaker) argument for suppressing race data in "resumes" (yes, I know, ballplayers don't submit resumes to their local NBA franchise).

>> Because in the US we are required to pretend that there is no such thing as race

Then you are not pretending very well. When I lived in the US I was shocked at how often it was an issue. It permeates nearly every aspect of US culture.

The icing on that cake: A government-run interactive map so you can look up which races live in which neighborhoods. Some versions allow you to zoom in to see little dots representing clusters of black or white residents. https://www.census.gov/library/visualizations/2021/geo/demog...

Actually, the US federal government specifically recommends that healthcare providers record patients' race, ethnicity, assigned sex, and gender identity. Most of those elements are self identified.

https://www.healthit.gov/isa/uscdi-data-class/patient-demogr...

Interesting, this is like the dog learning calculus thing. We may create an AI that could perceive things that we aren't able to, or perceive things differently, because we're "limited" in a way that the AI isn't. We wouldn't be able to even tell this is going on, because we don't have the mental model in place to account for it to understand it. We'd be the dog.
> racial bias in x-ray reading?

no, it implies there is a signal in the dataset that could be something other than clinical. This means that until they can pinpoint the cause, or the thing the AI is detecting, all the other things it predicts are suspect.

i.e. if the AI thinks the subject is West African, then it might be more inclined to diagnose something related to sickle cell.

Or a north-western European woman in her mid 60s vs a Japanese woman might get wildly different bone density readings for the same level of "blob" (most medical imaging is divining the meaning of blobs and smears).

My first thought here is to relate this to the problem of early colour film, which was largely tested and validated with only light skin tones in mind. Once it was put out into the wild, folks with darker skin tones found the product to be total crap. Why? Because there was a glaring OOD (Out of Distribution) problem during testing.

Similarly, if the train/test sets used for Machine Learning based X-ray diagnostics draw only on specific races, then performance might be worse for other races, given that there's a new discriminatory variable in play.

The obvious solution here is to reduce bias by ensuring race is part of the dataset used for training and testing. Which, due to PII laws in play, may actually be quite challenging! Fascinating tradeoff imo.
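For what it's worth, checking for that kind of gap is mechanically simple once the group label is in the test set. A minimal sketch, assuming binary labels and a hypothetical list of (group, y_true, y_pred) records; the groups "A" and "B" and all the values are toy data, not anything from the study:

```python
from collections import defaultdict

def sensitivity_by_group(records):
    """Compute sensitivity (true positive rate) separately for each group.

    records: iterable of (group, y_true, y_pred) tuples with 0/1 labels.
    Returns {group: sensitivity} for every group with at least one positive.
    """
    true_positives = defaultdict(int)
    positives = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 1:
            positives[group] += 1
            if y_pred == 1:
                true_positives[group] += 1
    return {g: true_positives[g] / positives[g] for g in positives}

# Toy data: the model finds 2 of 3 positives in group A but only 1 of 3 in B.
toy_records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 1, 0), ("A", 0, 0),
    ("B", 1, 0), ("B", 1, 0), ("B", 1, 1), ("B", 0, 0),
]

result = sensitivity_by_group(toy_records)
for group, sens in sorted(result.items()):
    print(f"group {group}: sensitivity = {sens:.2f}")
```

The catch is the one already mentioned: without the group column there is nothing to stratify on, so the check can't be run at all.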

I don't get it either. It's accurate. It would be a problem if it got it wrong, which could, for example, underweight quantitative genetic data and adversely influence differential diagnosis.
AI is driven by the training sets, but the goal is to find the underlying issues.

Suppose AI #1 got a higher score on the training data and AI #2 had a more accurate diagnosis. Obviously you want #2 but if there is bias in the training data based on race and the AI has access to race then eventually you overfit into #1.

ML models are great tools, but they're way too much of a black box. What you have here is a model that's predicting something you think it shouldn't have been possible to predict, and you can't simply ask it where that prediction comes from. Absent an explanation for how the model is doing this, you have to consider the possibility that whatever is poisoning that prediction will also poison others.
> ML models are great tools, but they're way too much of a black box.

A human doctor is also a black box, in meat form.

yep, the case for "enormous risk" hasn't been well articulated. It's been repeated a lot, but of all the problems in medical care, this isn't one of the larger ones.
What if it turns out that humans have identifiable biological differences among genetic sub-groups, ethnicities, etc? It would be anarchy in the social sciences.
soon they will want to remove race indicators for photographs and tik tok videos. who knows, maybe it's racist to be of a race >.>
I suspect this is a "tank vs sky" problem. The article says that the bright areas of bone are not the most important for predicting race. What if it's some features of different hospitals and x-ray setups?
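One cheap sanity check for the hospital/setup worry: see how well the acquisition site alone predicts the label, compared with always guessing the most common label. This is only a rough in-sample sketch. The site names and labels are invented toy data, and I have no idea whether the authors ran anything like it.

```python
from collections import Counter, defaultdict

def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the single most common label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

def site_only_accuracy(sites, labels):
    """Accuracy of predicting, for each site, that site's most common label."""
    per_site = defaultdict(Counter)
    for site, label in zip(sites, labels):
        per_site[site][label] += 1
    correct = sum(c.most_common(1)[0][1] for c in per_site.values())
    return correct / len(labels)

# Toy data: two hypothetical hospitals with skewed patient populations.
sites = ["h1"] * 5 + ["h2"] * 5
labels = ["x", "x", "x", "x", "y", "y", "y", "y", "y", "x"]

baseline = majority_baseline_accuracy(labels)
site_acc = site_only_accuracy(sites, labels)
print(f"majority baseline: {baseline:.2f}, site-only: {site_acc:.2f}")
```

If the site-only number sits well above the baseline, a model could score well just by recognising the scanner or hospital, which is the tank-vs-sky failure mode.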

Also did they release their code and anonymized data? If not, it's impossible to tell if this is a bug.

If I got this result in my work, I would check it 10k times over because it defies belief. Even allowing subtle skeletal differences in different ethnic groups, the differences in this case are not in the bone and at least sometimes not visible to the human eye. Unless there is an undiscovered difference in radio-opacity across ethnicities, the result doesn't make sense.

Replying to my own post because I can't edit it anymore.

Apparently this is a known and persistent effect across a variety of other medical images, tests, and scans. Not just for a "race" but for ethnic groups in general, as well as biological sex. So this might actually just be an "AI hit piece" that otherwise confirms an unpalatable but persistent and strong effect in the literature. The causes seem to be badly understudied, in part due to the obvious need for delicacy and respect around such topics.

This result is tremendously implausible to me, but I am finding quite a few articles documenting similar phenomena across things like retina scans and brain MRIs.

>This result is tremendously implausible to me, but I am finding quite a few articles documenting similar phenomena across things like retina scans and brain MRIs.

As prometheus76 says, perhaps you will one of these days be able to mentally resolve the inherent contradiction in the above sentence.

What is the value of being a smug jerk, especially if you plan to be wrong?

If your prior belief points strongly in one direction, it is completely rational to require strong weight of evidence in order to update it to point to the other direction.

And yes, it's a completely reasonable prior belief for a person who is not already versed in medical imaging literature.

I often find that people who study this literature have bad attitudes like yours. You should be grateful that there are people out there who value intellectual honesty enough to acknowledge when a result is a result and to change their beliefs. Instead I get two different people showing up to insult me.

What you are experiencing is cognitive dissonance. Take your time. It's never fun.
I don't see the value in insulting people about this. I wrote a longer response here: https://news.ycombinator.com/item?id=31421346 but it applies equally well to your post.