What's this enormous risk they're talking about? Racial bias in X-ray reading? Race can be a risk factor in plenty of diseases, so why should we actively try to remove this information from medical images?
"This issue creates an enormous risk for all model deployments in medical imaging: if an AI model relies on its ability to detect racial
identity to make medical decisions, but in doing so produced race-specific errors, clinical radiologists (who do not typically have
access to racial demographic information) would not be able to tell, thereby possibly leading to errors in health-care decision processes."
Without knowing the actual outcome, isn’t there also a possibility of error due to not knowing the race of the individual? They used mammogram images in the study and it is well known that incidence of breast cancer varies by race. Removing that information from the model could result in worse performance.
Well one thing you wouldn’t want to do is take the output of this model and then apply a correction factor for race on top of it, because the model is already taking that into account.
Well I suppose you only care about a correction factor to a binary model when it breaks a tie. You wouldn't want to apply a tiebreaker correction twice though.
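To put rough numbers on the "correcting twice" worry, here is a minimal sketch using the odds form of Bayes' rule. All figures (a 1% overall prevalence, a 2% prevalence in one group, an image finding with a likelihood ratio of 10) are made up for illustration and come from neither the paper nor this thread.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

def prob(o):
    """Convert odds back to a probability."""
    return o / (1.0 + o)

def posterior(prior, likelihood_ratio):
    """Posterior probability from a prior and a likelihood ratio."""
    return prob(odds(prior) * likelihood_ratio)

# Hypothetical numbers, chosen only to make the arithmetic visible.
overall_prevalence = 0.01
group_prevalence = 0.02
image_likelihood_ratio = 10.0

# A model that has already picked up the group-specific base rate.
model_output = posterior(group_prevalence, image_likelihood_ratio)

# Correcting for group again multiplies in the prior odds ratio a second time.
prior_odds_ratio = odds(group_prevalence) / odds(overall_prevalence)
double_corrected = prob(odds(model_output) * prior_odds_ratio)

print(round(model_output, 3), round(double_corrected, 3))
```

Under these assumptions the model's own answer is about 17%, and the doubly corrected one is about 29%, so the second correction shifts the estimate a lot more than a tiebreaker would.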
Unlike the authors of this research paper I am not a trained clinician, so I can't tell you. However I would note that the first exemplary value in the link you gave me is "REMOVED".
It doesn't provide example data, but there's still a spot in the standard for it. The values can differ by modality or manufacturer. Sure, it's not required, but certainly it's very important in some situations. Consider dermoscopy.
If interested, searching for "dicom conformance" should yield lots of docs that probably contain specific values for those things.
FWIW, the standard printed out is multiple linear feet of shelf space. There is a spot for a lot of things.
One common issue is that a lot of these kinds of tags rely on optional human input and are inconsistently applied, as opposed to, say, modality-specific parameters produced by a machine, which are consistent.
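If you wanted to check how inconsistently a tag is filled in across a set of studies, a fill-rate count is a reasonable first pass. This is only a sketch: the headers below are hand-written dicts rather than parsed files, `fill_rate` is a hypothetical helper, and the keywords `KVP` and `EthnicGroup` are borrowed from DICOM attribute names purely as examples. Treating "REMOVED" as empty is also an assumption.

```python
def fill_rate(records, keyword):
    """Fraction of records where `keyword` is present and holds a usable value."""
    if not records:
        return 0.0
    unusable = ("", "REMOVED")  # assumption: scrubbed values count as missing
    filled = sum(
        1 for r in records
        if str(r.get(keyword, "")).strip() not in unusable
    )
    return filled / len(records)

# Made-up headers: the machine-written field is always there,
# the human-entered one is blank, scrubbed, or absent most of the time.
headers = [
    {"KVP": "120", "EthnicGroup": ""},
    {"KVP": "110", "EthnicGroup": "REMOVED"},
    {"KVP": "120"},
    {"KVP": "100", "EthnicGroup": "self-reported value"},
]

for keyword in ("KVP", "EthnicGroup"):
    print(keyword, fill_rate(headers, keyword))
```

On real data you would read the headers with a DICOM library first; the counting step stays the same.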
DICOM is a great example of design by committee, with the positives and negatives that implies.
I don't understand that part. All modern EHRs have a field for self-reported race, and clinical radiologists do typically have access to that information. (Whether they actually look at it, or whether it's useful when reading images, are separate issues.)
OK, maybe it's a US-specific thing, but why wouldn't a clinical radiologist have all the information he can gather about his patient, including race, to help the diagnosis?
Because in the US we are required to pretend that there is no such thing as race and no such thing as gender, and all people are exactly and precisely the same and there can be no differences.
Not to get into a flame war, but I want to present an alternate option to yours.
Because in the US some people have a hard time understanding that all races and genders deserve to be treated equally as humans, with the same access to goods and services. Further, there are disparities in care based on race/ethnicity[1][2] and gender[3][4] because of the racism and sexism present in those systems. This sometimes leads to requiring that race/ethnicity and gender data be scrubbed, to keep people from affecting outcomes through their own biases.
It sometimes makes sense to scrub race/ethnicity/gender information from certain types of data, typically when a human is going to be making individual decisions.
For example, not having race data on resumes is generally productive, because that categorization can't provide a meaningful input to the decision associated with an individual person. Even if it were to be the case that there was some correlation between race and skill at whatever job you're interviewing for[1], the size of the effect is almost certainly small, and in the meanwhile you've also controlled for any bias in the person doing the reviewing.
If you're having a machine look at a dataset, and the machine determines that race or ethnicity is a material factor in determining some attribute in that dataset, you're not doing anybody any good by denying that fact and destroying the result.
[1] Let's ignore, for the purposes of this discussion, fields (like certain sports) where extreme competition combines with a position heavily dependent upon racially-linked physical characteristics. Though even in this case, there is still a (different, weaker) argument for suppressing race data in "resumes" (yes, I know, ballplayers don't submit resumes to their local NBA franchise).
Race is a rough, subjective, culturally-bound summary of characteristics. If you're already evaluating characteristics, adding either your guess of race or a self-reported race is like injecting gossip into good data.
If the outcome that you're trying to predict is also affected by perceptions of race, you've built a gossip feedback loop.
Then you should be looking at ethnicity and not "race" as such. For example, Ashkenazi Jews as an ethnic group are genetically very distinct from other Europeans, but are generally considered "white" on self-reported race surveys.
>If you're having a machine look at a dataset, and the machine determines that race or ethnicity is a material factor in determining some attribute in that dataset...
I think the trickiness is in providing the machine unbiased data to begin with, so that it doesn't learn incorrect associations with features like race. The most egregious examples I'm aware of are the machine learning systems used to suggest criminal sentencing, but, apropos of this topic, I believe there are cases where it may produce erroneous associations in something like skin cancer risk.
>> Because in the US we are required to pretend that there is no such thing as race
Then you are not pretending very well. When I lived in the US I was shocked at how often it was an issue. It permeates nearly every aspect of US culture.
The icing on that cake: a government-run interactive map so you can look up which races live in which neighborhoods. Some versions allow you to zoom in to see little dots representing clusters of black or white residents.
https://www.census.gov/library/visualizations/2021/geo/demog...
Actually, the US federal government specifically recommends that healthcare providers record patients' race, ethnicity, assigned sex, and gender identity. Most of those elements are self-identified.
Interesting, this is like the dog learning calculus thing. We may create an AI that could perceive things that we aren't able to, or perceive things differently, because we're "limited" in a way that the AI isn't. We wouldn't be able to even tell this is going on, because we don't have the mental model in place to account for it to understand it. We'd be the dog.
No, it implies there is a signal in the dataset that could be something other than clinical. This means that until they can pinpoint the cause, or the thing the AI is detecting, all the other things it predicts are suspect.
I.e., if the AI thinks the subject is West African, then it might be more inclined to diagnose something related to sickle cell.
Or a north-western European woman in her mid 60s vs. a Japanese woman might get wildly different bone density readings for the same level of "blob" (most medical imaging is divining the meaning of blobs and smears).
My first thought here is to relate this to the problem of early colour film, which was largely tested and validated with only light skin tones in mind. Once it was put out into the wild, folks with darker skin tones found the product to be total crap. Why? Because there was a glaring OOD (Out of Distribution) problem during testing.
Similarly, if the train/test sets used here for X-ray based diagnostics with machine learning cover only specific races, then performance might be worse for other races, given that there's a new discriminatory variable in play.
The obvious solution here is to reduce bias by ensuring race is part of the dataset used for training and testing. Which, due to PII laws in play, may actually be quite challenging! Fascinating tradeoff imo.
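One way to see this kind of gap is to break a metric out by group instead of reporting a single number. A minimal sketch, with made-up rows and a hypothetical `sensitivity_by_group` helper (nothing here comes from the paper):

```python
from collections import defaultdict

def sensitivity_by_group(rows):
    """rows: iterable of (group, has_disease, predicted_positive).

    Returns, per group, the share of true cases the model caught.
    Groups with no true cases are left out.
    """
    hits = defaultdict(int)
    positives = defaultdict(int)
    for group, has_disease, predicted_positive in rows:
        if has_disease:
            positives[group] += 1
            if predicted_positive:
                hits[group] += 1
    return {g: hits[g] / positives[g] for g in positives}

# Made-up test set: group "A" was well represented in training, group "B" was not.
rows = [
    ("A", True, True), ("A", True, True), ("A", True, True), ("A", True, False),
    ("A", False, False),
    ("B", True, True), ("B", True, False), ("B", True, False), ("B", True, False),
    ("B", False, False),
]

per_group = sensitivity_by_group(rows)
print(per_group)
```

The pooled sensitivity on these rows is 4 out of 8, which hides the fact that one group sits at 3 out of 4 and the other at 1 out of 4. Of course, you can only run this check if the group label is in the test data in the first place, which is the tradeoff.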
I don't get it either. It's accurate. It would be a problem if it got it wrong, which could, for example, underweight quantitative genetic data and adversely influence differential diagnosis.
AI is driven by the training sets, but the goal is to find the underlying issues.
Suppose AI #1 got a higher score on the training data and AI #2 had a more accurate diagnosis. Obviously you want #2 but if there is bias in the training data based on race and the AI has access to race then eventually you overfit into #1.
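A toy version of that #1 vs. #2 situation, as a sketch only. The assumptions are invented: a single "severity" score from 0 to 9, true disease at 5 and above, and recorded labels in group B that miss the mild cases (5 to 7). `model_1` can see the group and matches the recorded labels; `model_2` cannot.

```python
def accuracy(predict, patients, label_key):
    """Share of patients where the prediction matches the chosen label."""
    correct = sum(1 for p in patients if predict(p) == p[label_key])
    return correct / len(patients)

patients = []
for group in ("A", "B"):
    for severity in range(10):
        truth = severity >= 5
        # Assumed bias: mild cases (severity 5-7) in group B were never recorded.
        recorded = truth if group == "A" else severity >= 8
        patients.append(
            {"group": group, "severity": severity, "truth": truth, "recorded": recorded}
        )

def model_1(p):
    """Sees the group and reproduces the recorded (biased) labels."""
    return p["severity"] >= (5 if p["group"] == "A" else 8)

def model_2(p):
    """Ignores the group and uses one threshold for everyone."""
    return p["severity"] >= 5

for name, model in (("#1", model_1), ("#2", model_2)):
    print(name,
          "vs recorded labels:", accuracy(model, patients, "recorded"),
          "vs true status:", accuracy(model, patients, "truth"))
```

Scored against the recorded labels, #1 wins; scored against the true status, #2 wins. In practice you only ever have the recorded labels, so the training process has no reason to prefer #2.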
ML models are great tools, but they're way too much of a black box. What you have here is a model that's predicting something you think it shouldn't have been possible to predict, and you can't simply ask it where that prediction comes from. Absent an explanation for how the model is doing this, you have to consider the possibility that whatever is poisoning that prediction will also poison others.
yep, the case for "enormous risk" hasn't been well articulated. It's been repeated a lot, but of all the problems in medical care, this isn't one of the larger ones.
What if it turns out that humans have identifiable biological differences among genetic sub-groups, ethnicities, etc? It would be anarchy in the social sciences.