Hacker News
by titzer 1749 days ago
> This implies that there were a couple of AI systems that actually beat a radiologist,

Without more details about the error rates, we can't tell how likely it is that this is due to chance. I would caution against drawing any conclusions about AIs without a better understanding of the underlying statistics.

FTA:

> Thirty four (94%) of 36 AI systems evaluated in these studies were less accurate than a single radiologist, and all were less accurate than consensus of two or more radiologists.

So yeah, no AI system beat the consensus of two radiologists. That's pretty damning.

3 comments

Depending on how correlated the human's and the AI system's assessments are, this could be used as a verification step to decide whether consensus is needed. I.e., always run the ML system and only ask for a consensus if the ML system disagrees with the diagnosis. I would assume this could still provide a lot of value.
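
A rough sketch of that escalation rule, in case it helps make the idea concrete. The `Case` type, its field names, and the three sample cases are all made up for illustration; nothing here comes from the article.

```python
from dataclasses import dataclass


@dataclass
class Case:
    case_id: str
    radiologist_positive: bool  # first reader's call (hypothetical field)
    model_positive: bool        # ML system's call (hypothetical field)


def needs_consensus(case: Case) -> bool:
    """Escalate to a second reader only when the model disagrees with the first reader."""
    return case.radiologist_positive != case.model_positive


def triage(cases):
    """Split cases into (accepted, escalated) lists."""
    accepted, escalated = [], []
    for case in cases:
        (escalated if needs_consensus(case) else accepted).append(case)
    return accepted, escalated


# Invented sample data: only case "b" has a reader/model disagreement.
cases = [
    Case("a", True, True),
    Case("b", False, True),
    Case("c", False, False),
]
accepted, escalated = triage(cases)
print([c.case_id for c in escalated])  # → ['b']
```

The catch is the one already mentioned: if the reader and the model tend to make the same mistakes, agreement between them says little, and those shared errors never get escalated.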
Not a single AI model beat the consensus, but what about the consensus of the 36 AI models? Ensembling different models is a common technique for improving machine learning performance. Did they test that?
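
For anyone unfamiliar, the simplest form of ensembling is a majority vote. A minimal sketch, assuming every model gives a yes/no call on the same set of cases; the function names and the three toy models are invented, and the predictions are not real data.

```python
def majority_vote(votes):
    """Return True if more than half of the models flag the case as positive."""
    votes = list(votes)
    return sum(votes) * 2 > len(votes)


def ensemble_predict(per_model_predictions):
    """per_model_predictions[m][i] is model m's call on case i."""
    return [majority_vote(case_votes) for case_votes in zip(*per_model_predictions)]


# Three hypothetical models, four hypothetical cases.
preds = [
    [True, False, True, False],
    [True, True, False, False],
    [False, True, True, False],
]
ensemble = ensemble_predict(preds)
print(ensemble)  # → [True, True, True, False]
```

A vote like this only helps to the extent that the models' errors are not strongly correlated, and it requires all models to be run on the same cases.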
> That's pretty damning.

Indeed. And we all know how quickly radiologists are improving at their job. At this rate the 6% of AI systems that beat one radiologist will be down to 0% in no time.