by n3ur0n 1987 days ago
I do respect your experience and take on the matter, however, let's replace this statement:

"I'm an eye surgeon and self-taught machine learning practitioner, I started to learn Python in 2016 when the deep learning hype was at his highest."

with:

I'm a [machine learning researcher] and self-taught [ophthalmologist], I started to learn [ophthalmology] in 2016 when the [clinical medicine] hype was at his highest.

In this hypothetical situation, I bet you would instantly discount what I would have to say about ophthalmology because I clearly would not have the depth or experience to have an informed opinion on ophthalmology.

Over the past few years of ML hype, I have noticed quite a few clinicians who have self-taught some deep learning methods and then claim expertise in the subject area (not targeting you, just a general observation). I feel like many clinicians do not understand the breadth of machine learning approaches. There is just so much to know, from robust statistics and non-parametric methods to kernel methods. Deep learning and deep generative models are by no means the only tools at our disposal.

I absolutely agree with you though. Applied machine learning practitioners have been overselling their accomplishments, which I believe is detrimental to progress in the field.

I would highly encourage you to collaborate with ML researchers who have spent a decade or more working on hard problems. From the other side, I can tell you I gained a lot from discussing ideas with domain experts (neurologists, radiologists, functional neurosurgeons). They have insights that I could never have picked up by self-teaching.

3 comments

The troubles we are seeing with medical AI integration do not stem from a lack of personal ability, though. The problem is clearly systemic: medical data is currently mostly unusable (for both humans and machines, although humans often believe otherwise). So you can be as good as you want in medicine, ML, or both; the material support needed for wide applicability of medical AI is still lacking.
Haha, you are perfectly right. I totally admit that I'm an amateur with a low level of ML expertise.

On the other hand, ML researchers with deep expertise are extremely hard to find, even among statisticians / programmers. I suppose that the people with real expertise are working on their own startups or at FAANG.

This leads to a situation where medical research involving ML is largely uninteresting or full of bias. It is easy to spot in the literature.

I think the incentive structure is partly to blame. Historically, quantitative PhDs in healthcare (medical physicists, statisticians, computational geneticists) have been underpaid (in my opinion). Now, with FAANG and quant funds willing to pay $400K+ comp packages, there are far more exit opportunities for these PhDs.

On a positive note, I'm so glad that clinicians are taking an interest in ML! The fact that you were able to self-teach while practicing as an ophthalmologist is really impressive! I do know that a lot of companies are looking for people like you, who have clinical experience. If you are interested, you should explore roles or potential collaborations with some of these health research teams in tech.

I have barely any biology/medicine or machine learning knowledge (though some physics, maths, and programming), yet I might have to do an internship in the field of ML applied to leukocyte classification. Where would you recommend I start?
Depends on the scope of the project. Would the goal be to come up with a better algorithm for cell classification based on histological images? Or to apply an existing algorithm to a new dataset?

The former would be quite difficult without much background in ML/Computer Vision (you would have to spend some time self-teaching basics of ML/Deep Learning and the pre-reqs for those — Basic Linear Algebra and Probability).

The latter is doable. I would recommend a very hands-on approach. Pick some computer vision object classification tutorials and code them up (using a high-level library). Make a mind map of the concepts and look them up whenever you’re unclear about one. Then move on to replicating some well-cited, peer-reviewed papers. Often papers will have their code on GitHub. Try to reproduce their results on their dataset. After this you would have the basic working knowledge to modify the algorithm slightly for your specific use case.

The data in the database comes from a two-dimensional matrix (LMNE) where leukocytes are classified by resistivity on one axis and light absorption (?) on the other. (I wonder how they managed the separation by absorption... indirectly via centrifugation?) So I guess not really histological?

Looks like it's a new model; I have no idea if they have any ML models yet. There's also some database work.

I'm finishing a Master's degree in Computational Physics, so Linear Algebra and Probability shouldn't be an issue. (We also have an Image Processing and Analysis course.) I guess that's why they contacted us despite the fact that we don't have any ML training?

Yeah, this is basically what I thought I would do, but thank you for your advice!
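
To make the two-axis setup concrete, here is a minimal sketch of a nearest-centroid classifier on points in a 2D (resistivity, absorption) plane. Everything in it is an assumption for illustration: the points are synthetic, the population names and cluster centers are made up, and none of it is real LMNE data. It only shows the shape of the problem, not a recommended method.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the toy data is reproducible

def make_cluster(cx, cy, n, spread=0.05):
    """Draw n synthetic (resistivity, absorption) points around a center."""
    return [(random.gauss(cx, spread), random.gauss(cy, spread)) for _ in range(n)]

# Hypothetical cell populations with made-up centers in normalized units.
train = {
    "population_a": make_cluster(0.2, 0.2, 50),
    "population_b": make_cluster(0.5, 0.7, 50),
    "population_c": make_cluster(0.8, 0.3, 50),
}

def fit_centroids(data):
    """Compute the mean point of each labeled population."""
    return {
        label: (mean(p[0] for p in pts), mean(p[1] for p in pts))
        for label, pts in data.items()
    }

def predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    def sq_dist(c):
        return (c[0] - point[0]) ** 2 + (c[1] - point[1]) ** 2
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

centroids = fit_centroids(train)
print(predict(centroids, (0.52, 0.68)))
```

Real populations will likely overlap and have non-circular shapes, which is where the methods discussed in this thread would come in.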

Given your background, I think it would be worthwhile for you to pick up ESL [0] and read some relevant sections (supervised/sparse/linear methods). It's a great book and a good starting point for thinking about ML methods for high-dimensional data.

Also, it might be useful to look at the webpages of some researchers in this space and the courses they teach [1,2].

  [0] https://web.stanford.edu/~hastie/ElemStatLearn/  
  [1] https://scholars.duke.edu/person/dunson  
  [2] https://www.cs.princeton.edu/~bee/
Thank you!

Funny (but I guess expected) to see in that book's table of contents the Markov Chain Monte Carlo method that we learned very recently! (Unless it's another MCMC?)
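
For comparison, here is a minimal sketch of a random-walk Metropolis sampler, one common MCMC variant. Which variant the course or the book section actually covers is not stated in this thread, so treat that as an assumption. The function name `metropolis`, its parameters, and the standard normal target are all chosen for illustration only.

```python
import math
import random

def metropolis(log_density, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1D unnormalized log density."""
    rng = random.Random(seed)
    x = x0
    log_p = log_density(x)
    samples = []
    for _ in range(n_steps):
        # Propose a symmetric Gaussian step around the current state.
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p_new / p_old).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples

def log_standard_normal(x):
    """Log density of N(0, 1), up to an additive constant."""
    return -0.5 * x * x

samples = metropolis(log_standard_normal, x0=0.0, n_steps=20000)
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
print(sample_mean, sample_var)
```

The sample mean and variance should land near 0 and 1. In statistics the same machinery is typically pointed at a posterior distribution rather than a physical energy function.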