Hacker News
by naresh_xai 2323 days ago
If you have ever been involved in the drug discovery process, you will know that statistical evidence from clinical trials covers only the last ounce of drug testing. Out of 10,000+ candidate molecules, only ~5 are selected for clinical trials. Many candidates are eliminated earlier for toxic components, solubility issues, chemical stability issues, etc.

So even though we do not understand the precise impact of drugs on humans and there is no safer mechanism to test it, we leave only about 0.05% of candidate molecules (~5 in 10,000) to empirical/statistical evidence in the form of clinical trials.

On the other hand, if drug discovery were treated as a pure AI problem, we would have thousands of unverified and unsafe molecules in clinical trials.

Causal principles get us 99.95% of the way in drug discovery. Unfortunately, not so in AI.
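As a quick check of the funnel arithmetic, here is a minimal sketch that takes the figures quoted above at face value (10,000 candidates, 5 reaching trials). The variable names are mine, and the real numbers vary by program.

```python
# Figures quoted in the comment: 10,000+ candidates, ~5 reach clinical trials.
candidates = 10_000
reach_trials = 5

trial_share = reach_trials / candidates   # fraction left to clinical trials
filtered_share = 1 - trial_share          # fraction removed before trials

print(f"left to clinical trials: {trial_share:.2%}")
print(f"filtered out earlier:    {filtered_share:.2%}")
```

With these inputs the share left to trials is 5 in 10,000, so nearly all of the filtering happens before any statistical evidence is collected.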

1 comment

I'm not involved in the drug discovery process, but I can imagine manufacturers would be interested in using AI to work through the funnel more efficiently and reach those ~5 molecules sooner.

You're still left with double-blind trials and having to get large sample groups to try those molecules though.

And it's for that reason that drug discovery is always likely to be quite slow, complex and expensive - the efficiency gains will be pushed towards the top of the funnel to make new ideas reasonable to explore, I would imagine.

My point was that when you're not dealing with human physiology and instead dealing with problems that are more tractable through AI - i.e. using regression to tune algorithms through patterns in data - you are going to get quicker and more impactful returns without the same complexity.

And - critically - it's often OK to trust the AI solution you have without understanding causality. If you later find it's doing something odd and undesirable, you can use that data to help tune the algorithm again without having to understand the causal relationship.

Put another way, you can teach an AI to get better without necessarily understanding the subject completely yourself.

@Paul Robinson: Did you look at the number of mislabeled images in the Udacity self-driving car dataset? If you do not understand the subject yourself, you’re likely to make the same annotation errors, which feed into a bad model.

Let’s see. Robotics problems were claimed to be tractable through AI. However, in the large majority of robotics solutions today, roughly 90% is derived through control systems (which follow some degree of causal analysis), with AI then used to optimize the last few percent where possible.

Finding something odd in an algorithm (especially a deep neural network) is hard because these models fail in so many ways. For example, LeNet trained on MNIST almost always gives high-confidence predictions for random tensors (torch.randn). Most ImageNet models fail in the presence of just 20-30% salt-and-pepper noise. (Both of these problems are solvable through simple preprocessing techniques.)
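The comment does not say which preprocessing it has in mind. For salt-and-pepper noise, one common choice is a median filter, so here is a minimal pure-Python sketch under that assumption. The synthetic gradient image, the 25% noise level, and the helper names are all made up for illustration, and no model is involved.

```python
import random
from statistics import median

def add_salt_and_pepper(img, amount, rng):
    """Return a copy of img with roughly `amount` of its pixels set to 0 or 255."""
    noisy = [row[:] for row in img]
    for row in noisy:
        for c in range(len(row)):
            if rng.random() < amount:
                row[c] = rng.choice((0, 255))
    return noisy

def median_filter_3x3(img):
    """Replace each pixel with the median of its 3x3 neighbourhood (edges clamp)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [
                img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            out[r][c] = median(window)
    return out

def mean_abs_error(a, b):
    """Average absolute per-pixel difference between two same-sized images."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))

rng = random.Random(0)
clean = [[4 * (r + c) for c in range(32)] for r in range(32)]  # smooth gradient
noisy = add_salt_and_pepper(clean, 0.25, rng)
filtered = median_filter_3x3(noisy)

print("error before filtering:", round(mean_abs_error(noisy, clean), 1))
print("error after filtering: ", round(mean_abs_error(filtered, clean), 1))
```

The median discards isolated extreme pixels, which is why it suits this kind of noise. Whether it is enough to restore a given classifier's accuracy would have to be measured on that model.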

Not to mention the fact that most models are trained without a background class and tend to give overconfident predictions on out-of-distribution samples.
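To illustrate why a missing background class matters, here is a small sketch. The logits are hypothetical numbers chosen by hand, not output from any real model. Softmax normalizes over the known classes only, so all of the probability mass has to land on one of them.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for an input that belongs to none of the 3 known classes.
out_of_distribution = [9.0, 1.0, -3.0]

# Without a background class, the probabilities still sum to 1 over known classes.
probs = softmax(out_of_distribution)
print("no background class:  ", [round(p, 3) for p in probs])

# With a (hypothetical) extra background logit, the mass has somewhere else to go.
with_background = softmax(out_of_distribution + [11.0])
print("with background class:", [round(p, 3) for p in with_background])
```

A background class only helps if the model was trained with representative "none of the above" examples, so this shows the mechanism, not a fix.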