You're missing the point: it takes me < 5 seconds to clear the adrenals. The example is intended to illustrate that there is no feature to extract that would make a model BETTER than a human for the things that we care about (rare and challenging diagnoses).

No one is arguing that ML can't segment and measure a structure; this is the lowest-hanging fruit. ML can't diagnose an adrenocortical carcinoma (an example of a rare disease) because medicine doesn't know how to.

> In other words, we extract a single feature, and apply a single nonlinear activation function to that feature to decide whether or not to activate the 'treat' signal.

Now do this for the > 1000 other possible diagnoses on a CT abdomen, and have it match or beat a human's ROC curve, finish in under 5 seconds, and cost less than the $70 a radiologist bills for this. Unless you can eliminate having someone like me read this scan, an ML model that measures the adrenal glands is worth $0.
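For concreteness, the quoted "single feature plus one nonlinear activation" idea amounts to roughly the sketch below. Everything in it is hypothetical: the function names, the 10 mm threshold, the steepness, and the measurements are made up for illustration and are not clinical values. It only shows what one feature and one ROC number look like, which is the easy part.

```python
import math

def treat_signal(adrenal_size_mm: float, threshold_mm: float = 10.0,
                 steepness: float = 1.0) -> float:
    """Sigmoid activation over a single measured feature.

    threshold_mm and steepness are hypothetical, not clinical cutoffs.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (adrenal_size_mm - threshold_mm)))

def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive case scores higher
    than a random negative case (ties count as half).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up measurements (mm) and made-up ground-truth labels.
sizes = [6.0, 8.0, 11.0, 9.0, 12.0, 15.0]
labels = [0, 0, 0, 1, 1, 1]

scores = [treat_signal(s) for s in sizes]
auc = roc_auc(scores, labels)
print(f"AUC for one feature, one diagnosis: {auc:.3f}")
```

That is one feature, one diagnosis, and one number. The requirement above is the same thing repeated across every other possible finding on the scan, at radiologist-level performance on each.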

I'm aware of the literature in this space. Your proposal is not novel and has been attempted. As soon as you try doing this on more than a handful of (typically easy) diagnoses, it stops working. Currently the only useful models flag normal/abnormal to triage interpretation priority.

> Look at how LLMs are able to do stuff like write code that has comments written in pirate speak.

This anecdote doesn't prove anything, but we can instead look at OpenAI's own white paper for their more rigorous data on hallucination and accuracy. LLMs aren't ready for a production CRUD app, let alone human life.

> ML models looking at diagnoses with small training sets that are largely in obsolete scan formats could stil

It's not obsolete. It's a completely different image type. This is akin to saying an ML model trained on black and white sketches can paint the Mona Lisa in color.

> all the anatomical knowledge.

A misunderstanding of the problem. The anatomy is easy. The pathology is updated every 1-5 years so there is no historical dataset.