Hacker News
by Jonanin 1749 days ago
This implies that there were a couple of AI systems that actually beat a radiologist, which I take as extremely promising for the field of AI radiology.

Like any domain in applied AI, there will be a lot of approaches that miss the mark, or are simply stepping stones to better approaches. There are thousands and thousands of papers on language modeling, but we only needed one superior approach (GPT) to change the game entirely.

The search through any cutting-edge problem space is messy and full of failure, and that's fine. You only need one breakthrough.

9 comments

> This implies that there were a couple of AI systems that actually beat a radiologist,

Without any more details about the error rates, we can't be sure how likely this is due to chance. I would caution against drawing any conclusions about AIs without better understanding the underlying statistics.

FTA:

> Thirty four (94%) of 36 AI systems evaluated in these studies were less accurate than a single radiologist, and all were less accurate than consensus of two or more radiologists.

So yeah, no AI system beat consensus of two radiologists. That's pretty damning.

Depending on how correlated the human's and the AI system's readings are, this could be used as a verification system to determine whether consensus needs to happen. I.e., always run the ML system and only ask for a consensus if the ML system disagrees with the diagnosis. I would assume this could still provide a lot of value.
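
A minimal sketch of that workflow, just to make the idea concrete. Everything here is an assumption for illustration: the function name `cases_needing_consensus`, the boolean "positive finding" encoding, and the toy reads are all made up and not taken from the study.

```python
def cases_needing_consensus(reads):
    """Return indices of cases to escalate to a second radiologist.

    `reads` is a list of (radiologist_positive, model_positive) booleans.
    Hypothetical rule: escalate only when the model disagrees with the
    single radiologist's read.
    """
    return [i for i, (rad, model) in enumerate(reads) if rad != model]


# Made-up reads for five cases: (radiologist, model)
reads = [
    (True, True),
    (False, False),
    (True, False),   # disagreement
    (False, True),   # disagreement
    (False, False),
]

escalated = cases_needing_consensus(reads)
print(escalated)                    # indices sent for consensus
print(len(escalated) / len(reads))  # fraction of workload escalated
```

Whether this helps depends on the point above: if the model and the radiologist tend to make the same mistakes, agreement tells you little.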
Not a single AI model is better, but what about the consensus of the 36 AI models? Ensembling different models is a common technique to improve machine learning models. Did they test that?
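
For anyone unfamiliar with ensembling, here is a minimal sketch of the simplest version, a majority vote over binary predictions. The three "models" and their outputs are invented for illustration, and the helper name `majority_vote` is hypothetical; nothing here comes from the study.

```python
def majority_vote(model_predictions):
    """Combine binary predictions (0/1) from several models by majority vote.

    `model_predictions` is a list with one prediction list per model.
    Assumes an odd number of models so there are no ties.
    """
    n_models = len(model_predictions)
    return [
        1 if 2 * sum(votes) > n_models else 0
        for votes in zip(*model_predictions)
    ]


def accuracy(predicted, truth):
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Toy data: each made-up model is wrong on a different case.
truth = [1, 0, 1, 1, 0]
model_a = [0, 0, 1, 1, 0]  # wrong on case 0
model_b = [1, 1, 1, 1, 0]  # wrong on case 1
model_c = [1, 0, 0, 1, 0]  # wrong on case 2

ensemble = majority_vote([model_a, model_b, model_c])
print(accuracy(model_a, truth), accuracy(ensemble, truth))
```

The toy case is rigged so that the errors never overlap. If the models tend to fail on the same cases, voting gains little or nothing.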
> That's pretty damning.

Indeed. And we all know how quickly radiologists are improving at their job. At this rate the 6% of AI systems that beat one radiologist will be down to 0% in no time.

I'd push back on the 6% of AI systems being better than a radiologist and calling that a success, but you are right in the meta.

It's fair to say that yes, AI systems aren't good enough yet. On the other hand, it's pretty clear some technological approach will outperform a radiologist at pattern recognition at some point in the future - whether that's "AI" or "if statements" or some third option.

It's just a matter of time.

Another interesting subtlety is that there are only a finite number of radiologists and they’re generally concentrated in wealthy countries/areas.

AI-based analysis - whether it’s better than a human radiologist or not - is far more scalable and cost-effective. Even if used as a screening mechanism to be escalated to a human radiologist, this approach will be very helpful to much of the world.

> I'd push back on the 6% of AI systems being better than a radiologist and calling that a success,

How much time does it take to train a radiologist to that level of performance?

How much time does it take to clone that ML model?

I just meant that it's not clear from this that the 6% are 'overall better', just that the 94% are 'overall worse'. More data is needed, but it does appear that progress is being made, and I'm excited by that.

After all, no AI beat two radiologists.

It depends on context.

Here, the context appears to be a somewhat arbitrary selection of published algorithms. All they've really determined is that at least 94% are not ready to replace radiologists.

That's pretty much confirmation of the default assumption. If they were, they'd all be trying to get these into hospitals, and they're not.

You could also say: There exist at least some radiologists that can be beaten by an AI system. :D

I guess a good question could be: Will they be more reliable than radiologists in scenarios other than the ones studied?

And how do we know that this small handful that beat the radiologists didn't just get lucky? You really need to know the sampling distribution of what's being measured here.
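
To put a rough shape on the "got lucky" worry, here is a small simulation sketch. All numbers are assumptions picked for illustration (test-set size, accuracies, one independent radiologist comparison per system); the article gives none of them, so this shows the mechanism, not an estimate for the actual studies.

```python
import random


def prob_at_least_k_beat(n_systems=36, k=2, n_cases=200,
                         acc_ai=0.80, acc_rad=0.85,
                         n_trials=200, seed=0):
    """Estimate P(at least k of n_systems score higher than a radiologist)
    when every system's true accuracy is acc_ai and the radiologist's is
    acc_rad. Each system is compared against its own simulated radiologist
    on n_cases independent cases (a simplifying assumption).
    """
    rng = random.Random(seed)

    def n_correct(acc):
        # Number of correct reads out of n_cases at true accuracy `acc`.
        return sum(rng.random() < acc for _ in range(n_cases))

    hits = 0
    for _ in range(n_trials):
        lucky = sum(n_correct(acc_ai) > n_correct(acc_rad)
                    for _ in range(n_systems))
        if lucky >= k:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    # Every simulated system is truly worse than the radiologist here.
    print(prob_at_least_k_beat())
```

With small test sets, a handful of truly worse systems can come out ahead by chance; larger `n_cases` or a bigger accuracy gap makes that much less likely.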
It doesn't imply anything of the sort. Until a well-powered randomized controlled clinical trial shows an overall mortality benefit from an AI screening program, the field hasn't contributed meaningfully to medicine. I'm not saying it won't happen, but we are almost certainly very far from that goal.
Clinical AI (which is currently regulated as a CAD medical device by the FDA) won't replace radiologists but will be treated as an additional clinical vendor application integrated into existing software, similar to the speech recognition dictation that Nuance has provided for decades.
The study mentions that all of them were "less accurate than consensus of two or more radiologists".
anon gets fooled by randomness
Thanks for this well-written, fact-driven comment.