by 0xab
2611 days ago
I agree, they should do one or the other. The imbalance is totally artificial and objectionable though. Where's the evidence that doctors see an 80/20 split in real life? If there is going to be an imbalance, they should make it reflect the actual statistics of the task that the doctors perform, not some artificial number. It doesn't even reflect the statistics of the dataset they started with (which is 90/10 unbalanced). Admittedly, the correct analysis for unbalanced data is more annoying, and ROC curves are easier to interpret. That's why in something like ImageNet, even though the training set is imbalanced, the test set is balanced. Comparisons against humans are also harder when the data is imbalanced in a way that reflects the training set, not the task. Humans don't know they are supposed to say "no" 80% of the time. That rewards the machine, and it isn't easy to correct (you can correct what you think about the machine's results with respect to a baseline, but not the biases the humans had).
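To make the base-rate point concrete, here is a rough sketch of how the share of correct "yes" calls (positive predictive value) moves with the class split while the classifier itself stays the same. The 0.9 sensitivity and 0.9 specificity are made-up illustration values, not numbers from the study, and `ppv` is just a hypothetical helper name.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Same hypothetical classifier, three different test-set splits:
# balanced, 80/20, and 90/10.
for prevalence in (0.5, 0.2, 0.1):
    print(f"prevalence={prevalence:.0%}  PPV={ppv(0.9, 0.9, prevalence):.3f}")
```

Sensitivity and specificity (and so the ROC curve) don't change with the split, but anything that depends on the prior does, which is why the choice of split matters for a human-vs-machine comparison.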
|
Because they definitely don’t. Even in a select subpopulation - say, people going to a derm for screening - you’d expect one melanoma per 620 persons screened (as per the SCREEN trial). Most people have more than one mole for evaluation, and even those with melanoma will have multiple innocent moles... a mole count >50 triggers a referral for screening, though for more cautious docs, possibly as few as 25...
If you wanna be really generous and consider our hypothetical high-risk group to have an average of 10 moles per person, that’s roughly 6,200:1 (620 persons × 10 moles each), not 80:20.
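For anyone who wants to check the arithmetic, a quick sketch. It assumes exactly one melanoma lesion per 620 persons screened and a flat 10 moles per person, both simplifications of the figures above, and `benign_to_melanoma_ratio` is a hypothetical helper name.

```python
def benign_to_melanoma_ratio(persons_per_melanoma, moles_per_person):
    """Benign moles per melanoma, assuming one melanoma lesion in the group."""
    total_moles = persons_per_melanoma * moles_per_person
    return total_moles - 1  # everything except the single melanoma is benign

ratio = benign_to_melanoma_ratio(620, 10)
prevalence = 1 / (ratio + 1)  # fraction of evaluated moles that are melanoma

print(f"{ratio}:1 benign to melanoma")
print(f"per-mole prevalence: {prevalence:.4%} (the 80:20 split implies 20%)")
```

With a lower referral threshold or a higher mole count per person the ratio only gets more lopsided, so the generous assumption already puts it three orders of magnitude away from 4:1.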