Hacker News
by ganstyles 1960 days ago
My company has used ML to create synthetic cancer data to train classifiers to augment doctors/specialists who are looking for cancer. This work has greatly increased accuracy in diagnosis, saving lives. To say it's only for music generation or generating waifus is a bit unfair.
9 comments

> This work has greatly increased accuracy in diagnosis, saving lives.

As an MD with a special interest in statistics, color me skeptical. I'd love to be proven wrong though, so please provide references.

Edit: yeah, so the way this whole thread is developing really goes to show (yet again) that medical AI hype is relying as strongly as ever on the fantasies of people who've never seen any clinical work.

I know of at least one Canadian hospital that's incorporated ML into 100% of their ED triage. Sure it's not some state of the art deep learning architecture, but it's definitely a step above the old crop of heuristic-based systems you see so often in medical software. "Medical AI" is a stupid term that's been co-opted by more hucksters than legitimate practitioners, so I prefer to talk about more concrete (and less fanciful) applications like patient chart OCR or capacity forecasting.
> I know of at least one Canadian hospital that's incorporated ML into 100% of their ED triage.

Cool! Which hospital is that? Is the clinical staff happy with the results?

Personally, I've never seen any medical ML application that made my job easier. But it would be nice to see.

Maybe not hard data, but EPIC's software (which is used in about 25% of hospitals for EHR) has over the years been using patient data for treatment recommendations. Again, it's difficult to weigh the impact if we don't know how many doctors are relying on these types of recommendations and acting on them, but it is definitely out there in the real world at the moment: https://www.epic.com/software#AI
> we don't know how many doctors are relying on these types of recommendations and acting on them

I can answer that: close to zero. Clinicians don't want stuff that makes recommendations, as good as they may be. They want a bicycle for the mind: something that helps them visualize, understand the big picture and anticipate better. And also something that ensures trivial to-do items are not forgotten (now that's the place a recommender engine could fit in). Assuming otherwise is a fundamental misunderstanding of what a clinician's job is, and unfortunately a very common one.

What do you ask of your software tooling? Do you want something that just tells you what to write? No, you want a flexible debugger. A compiler with precise error messages. You want a profiler with a zillion detailed charts allowing you to understand how everything fits together and why such and such is not the way you anticipated. Same thing for medicine until the day machines will actually do better than humans, which is not tomorrow nor the day after.

Do you have any papers published to support such a strong claim (one that directly contradicts the sentiment of almost every single oncologist and pharmacologist I know that isn't trying to generate profits for a biotech company)?
edit: I posted that I have internal data, but also realized I said a little bit too much about the process. The below point someone is making is a totally fair one. Editing this though for Reasons while trying to keep the part of the comment that led to the below dismissal, and also to clarify my definition of "internal data" to be more expansive than "internal testing on datasets" which is what I realize it might sound like.
That's not how this works. The only way to show an actual reduction in all cause mortality as the result of an intervention, treatment, or screening process is through a randomized controlled clinical trial. If none have been performed, you don't have evidence that lives have been saved. Extraordinary claims require extraordinary evidence.
i think focusing entirely on one type of evidence is a little unimaginative. if these guys have data on doctors performing some task with and without their tool, they're in a good place to measure the difference. they can take that all the way to the bank, and to me that would contribute to what i'd call evidence.
Until they have to really show that it's working in day-to-day practice. Where it most likely won't, unfortunately.
Totally fair. It is a very, very hard field to make progress in. I would also take anything I say with a grain of salt, I'm not trying to convince you, just to bring an additional data point to the thought that these techniques aren't very useful or impactful.
Maybe the GP threw the wrong buzzword and meant AI instead of ML.
I think ML is currently useful, perhaps only temporarily, in fields that have been making decisions mostly based on intuition and heuristics. The medical field is one example: even with some knowledge of biology and anatomy, it's hard to diagnose and treat patients with deductive reasoning alone, and a lot of guesswork and "experience" is involved. In that case ML might be able to perform better than humans, but I think this will have its limits. Above a certain point, I think biological simulation (as in physics simulation) would be a much more useful tool for doctors to understand the human body.
I'm skeptical... But it depends what data sources are available. I was a paramedic so my medical knowledge is limited, but at the same time, we frequently had to do field diagnosis. It's hard to explain... but you can have patients with the same symptoms and two totally different diagnoses. You basically just learn to intuit the difference, but none of the stuff we can write down or quantify drives the differential diagnosis. And it's funny because you get pretty good at it. I could just tell when an elderly patient had a UTI even though they had a whole cluster of weird symptoms. Or more importantly, I could tell you with great accuracy when someone was just a psych case despite complaining of symptoms matching some other condition.

It'd be really hard to train a computer when to stop digging because there's nothing to find, or when to keep digging because this patient really doesn't feel like a psych case. And the tests and diagnostics aren't without risk and cost.

I've had a greybeard doctor in my personal life who somehow read between the lines and nailed a diagnosis despite my primary symptoms being something else entirely. (I had recurring strep tonsillitis for months, and yet he just somehow knew to step back and order a mono test. It came back negative the first time, and he knew to have me tested AGAIN, and lo and behold it was positive. None of my symptoms were really consistent with mono. I tested positive for strep each time and antibiotics would clear it.) Thankfully I happen to be allergic to the first-line antibiotic, because if you give amoxicillin to someone with mono, something like 90% of them get a horrible rash all over their body.

I don't know, if you ever look at a flowchart of biochemical processes, realizing that what we've mapped out is only a tiny sliver of what actually occurs, you'd be more pessimistic about simulation in the near term. We can simulate things all we want, but the hard part is rooting the simulation in hard evidence, something which requires massive capital and time investment. Epigenetics complicates things even further.
Would you happen to have any links to share further explaining the limits of our knowledge of biochem processes?

How does epigenetics complicate this further? Is it that it widens the number of inputs into a biochem system?

Not exactly what you asked for, but PathBank is a database that quantitatively describes a large part of what we do know: https://pathbank.org/

As far as what we don't know, I'm not sure there's a list. Lack of knowledge implies lack of awareness. I can offer one example: We don't know much about the processes by which collagen fibers are grown and assembled into μm- and mm-scale load-bearing structures in tendon, ligament, and bone between embryo and adult, particularly in mammals. Nor do we know the extent to which collagen fiber structures are capable of turnover in adults; healing might only be possible by replacement with inferior tissue such as scar.

Personally, I think the complexity of biological systems, and the difficulty of observing their components directly when and where you'd want to, means that they can only be understood with the help of machines. Not necessarily using convolutional neural networks though.

Yeah it would increase the conditionality or contextuality of any given observation because even if the genomes were identical, you have differing levels of gene expression based on environmental stimuli.

So observing that gene X impacts biochemical pathway in some way Y is already really difficult when there are tons of other genes at play. Add on the fact that these genes could be triggered to stop expressing themselves in certain conditions and it makes the whole process of figuring out what is really going on that much more difficult. Even if we can make some observation, there are tons of contextual situations which would potentially invalidate that observation.

That's actually wonderful to hear! Is there some way to assist with that work? Doing something with a human impact is appealing.

(To be clear, my argument wasn't that ML isn't useful -- but rather that individual lone hackers are less likely to be using ML to achieve superman-type powers than I originally thought. Supermen do exist, but they are firmly in the ranks of DeepMind et al, and must pursue projects collectively rather than individually.)

> individual lone hackers are less likely to be using ML to achieve superman-type powers than I originally thought

For a single individual to have "superhuman" impact with ML, they need not only generic ML knowledge, but also specialized knowledge of some domain they want to impact. Actually, because ML has become so generic (just grab a pre-trained model, maybe fine-tune it, and push your data through it) a very shallow understanding of the fundamentals is probably enough, and in-depth domain knowledge much more important.

That doesn't mean generic ML research isn't important, it's just that it has an average impact on everything, not a huge impact in one specific area.

(I suspect many hobbyist ML projects are about generating entertaining content because everyone has experience with entertainment, even ML researchers.)

100% nailed it. ML/AI are tools. As in any other exercise, tools can make it easier for beginners/amateurs to engage with a project, but they don't replace 10,000 hours of experience and deep domain understanding. It's the master craftsmen and domain experts who will create the most value with these tools, but that may not be in obvious or clearly visible ways.
They also likely need a lot of training data and compute resources
Yes, if you carry some weight in the field then the most useful contribution would be to push for better automated gathering of quality clinical data. This is the most limiting factor currently.

Of course policy activism is far less sexy than building new shiny things, so there's little interest in that.

It has affected a small number of fields for sure. Imaging based diagnosis might be better off due to ML, but imaging based diagnosis isn't going to cure cancer or be helpful for every single disease. Glad it's helping, but the author's point is only reinforced. Unless you can make a case that we will cure basically everything with ML, that is.
Imaging based diagnosis is not going to cure cancer, but it can guide treatment: based on what the AI reads from the images, patients can get drugs that are very effective for their particular cancer. We have very effective drugs nowadays; a large part of treating cancer is figuring out which drug to give.

Imaging based diagnosis could read presence or absence of particular gene mutations from the images so that the genes can be silenced by the drugs.

Imaging based diagnosis could also figure out whether a particular cancer precursor is going to develop into invasive cancer and do it better than the experts we have now (otherwise we wouldn't use the AI).

This can also be done cheaper than paying consultants to figure it out and it can be done in locations where they don't have the specialists.

Some companies working in the field (some already have tools approved for use on patients):

https://analogintelligence.com/artificial-intelligence-ai-st...

> Imaging based diagnosis could read presence or absence of particular gene mutations from the images so that the genes can be silenced by the drugs.

> Imaging based diagnosis could also figure out whether a particular cancer precursor is going to develop into invasive cancer and do it better than the experts we have now (otherwise we wouldn't use the AI).

Where is the evidence for these claims, other than a VC hype sheet? Like real clinical trials. These claims also show a fundamental misunderstanding of what this data can tell us. Imaging data doesn't give you tumor genetic profiles. It can give you tumor phenotype, which is associated with specific mutations. To get the true genetic profile you need to do deep sequencing at tens of thousands of dollars per tumor, and even then you have the problem of tumor heterogeneity, which lets the cancer evade the treatment.

A major concern I have working in this space is that we're selling people on grand promises of far off possibilities rather than what we can actually deliver right now.

Just the latest iteration of people who don't know biology (used to be Physicists, now it's the AI guys) coming in to save all of us. Once in a while someone does make meaningful contributions, but in the end it's hard to say if the collective investment in attention and money has made it worthwhile or not.
Histology slides absolutely can tell us a lot about molecular changes, see

https://www.nature.com/articles/s41591-019-0462-y

Of course changes in the genotype that impact the phenotype enough to influence the disease also influence the morphology of the cells.

But this is an area of active research so you can't expect phase 3 clinical trials. Yet.

EDIT: here is another, more "perspective"-style paper from the same authors on how such tools could be used and integrated into current processes

https://www.nature.com/articles/s41416-020-01122-x

ML diagnosis could actually be worse for us overall, as we might find more harmless cancers and subject people to more unnecessary tests and treatments. Iatrogenic harms are real, especially when ML gives us only diagnostics, and never any treatments.
Regina Barzilay has done some work in this area; a few years ago I posted slides from a great talk she gave. The slides seem to be gone and not on Internet Archive sadly...

https://news.ycombinator.com/item?id=20019355

This Twitter thread also has a lot of good stuff.

https://twitter.com/maite_taboada/status/1086415051127308288

Ah, found the PDF - the URL changed at some point before disappearing.

https://web.archive.org/web/20190527041657/http://people.csa...

What did your company do to share those findings with doctors worldwide? Did your company reduce mortality?

Otherwise it's kind of like, I have invented SkyNet in my garage but I am only using it to become richer through the stock market.

It's admirable that you are working on saving human lives. But are human lives actually saved?

> Did your company reduce mortality?

From the parent:

> This work has greatly increased accuracy in diagnosis, saving lives.

... which is apparently pure speculation
Can you expand on how you create that synthetic data and how the evaluation (increased accuracy in diagnosis) works?
Earlier detection. Is this new tech, or just a sharper hammer?
We don't know yet. What's sure is that it has greatly increased the number of cancer surgeries, especially lungs.

We don't know if that's a good thing yet.