Reading this, it sounds like 'AI' is when you build a heuristic model (which we've had for a while now) but pass some threshold of cost in terms of input data, GPUs, energy, and training. The classical approach was to understand how genes transcribe to mRNA, and how mRNA translates to polypeptides; how those are cleaved by the cell, and fold in 3D space; and how those 3D shapes result in actual biological function. It required real-world measurement, experiment, and modeling in silico using biophysical models. Those are all hard research efforts.
And it seems like the mindset now is: we've done enough hard research, let's feed what we know into a model, hope we've chosen the right hyperparameters, and see what we get. Hidden in the weights and biases of the model will be that deeper map of the real world that we have not yet fully grasped through research. But the AI cannot provide a 'why'. Its network of weights and biases is as unintelligible to us as the underlying scientific principles of the real world we gave up trying to understand along the way. When AI produces a result that is surprising, we still have to validate it in the real world, and work backwards through the hard research to understand why we are surprised.
If AI is just a tool for a shotgun approach to discovery, that may be fine. However, I fear it is sucking a lot of air out of the room from the classical approaches. When 'AI' produces incorrect, misleading, or underwhelming results? Well, throw more GPUs at it; more tokens; more joules; more parameters. We have blind faith it'll work itself out. But because the AI can never provide a guarantee of correctness, it is only useful to those with the infrastructure to carry out those real-world validations on its output, so it's not really going to create a paradigm shift. It can provide only a marginal improvement at the top of the funnel for existing discovery pipelines.
And because AI is very expensive and getting more so, there's a pretty hard cap on how valuable it would be to a drugmaker. I know I'm not the only one worried about a bubble here.
For decades, CV was focused on trying to 'understand' how to do the task. This meant a lot of hand-crafting of low-level features that are common in images, and finding clever ways to make them invariant to typical 3D transformations. This works well for some tasks, and is still used today in things like robotics, SLAM etc. However, when we then want to add an extra level of complexity, e.g. to try and model an abstract concept like "cat", we hit a bit of a brick wall. This happens to be a task where feeding a large dataset into a (mostly) unconstrained machine learning model does very well.
> The classical approach was to understand how genes transcribe to mRNA, and how mRNA translates to polypeptides; how those are cleaved by the cell, and fold in 3D space; and how those 3D shapes result in actual biological function.
I don't have the expertise to critique this, but it does sound like we're in the extreme 'high complexity' zone to me. Some questions for you:
- How accurate does each stage of this need to be to reach useful performance? Are you sure there are no brick walls here? How long do you think this approach will take to deliver results?
- Do you not have to validate a surprising classical finding in the same way that you would an AI model, i.e. how much does the "why" matter? "The AI can never provide a guarantee of correctness" is true, but what if it were merely extremely accurate, in the same way that many computer vision models are?