by mturmon
2202 days ago
Thanks for this generous comment. The author of TFA has articulated a genuine problem that is central to many large-scale investigations these days, across many domains. We rely a lot on complex computer simulations, or complex physics-based models, that have a lot of fiddly details that are understood by only a limited set of people. Yet, we want to learn from these models, and we want to reach conclusions from them. This has turned into a key problem for the scientific enterprise.
There are so many linked issues, some technical, some philosophical: Mere Monte Carlo state exploration is wasteful and doesn't provide much insight. Often we don't have error bars on model outputs to even know if an "improvement" in a metric is significant. There can be unknown unknowns that keep us from trusting our models completely. It's a very rich and challenging problem space.
In my understanding, the Dept. of Energy was the first community to engage with these problems due to the test ban treaty. They had the mandate to ensure the nuclear stockpile works, despite not being able to fully test it. So they need models and they need to know how far to trust them. One landmark reference for that is the NAS report on uncertainty quantification and complex models: https://www.nap.edu/catalog/13395/assessing-the-reliability-...
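To make the error-bar point concrete, here is a minimal sketch of one common way to check whether an "improvement" in a metric is distinguishable from noise: a paired percentile bootstrap on the per-case difference between two model variants. Everything in it is an assumption for illustration (the function name `bootstrap_diff_ci`, the synthetic per-case errors, the 95% level); it is not taken from the NAS report or from TFA.

```python
import random
import statistics


def bootstrap_diff_ci(errors_a, errors_b, n_boot=2000, alpha=0.05, seed=0):
    """Paired percentile-bootstrap interval for mean(errors_b - errors_a).

    errors_a and errors_b are per-case errors of two model variants,
    evaluated on the same cases (hypothetical inputs).
    """
    rng = random.Random(seed)
    diffs = [b - a for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    boot_means = []
    for _ in range(n_boot):
        # Resample the paired differences with replacement.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        boot_means.append(statistics.fmean(sample))
    boot_means.sort()
    lo = boot_means[int((alpha / 2) * n_boot)]
    hi = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return statistics.fmean(diffs), lo, hi


# Synthetic data: variant B is nominally 0.02 better per case, with large scatter.
data_rng = random.Random(1)
errors_a = [data_rng.gauss(1.0, 0.5) for _ in range(30)]
errors_b = [a + data_rng.gauss(-0.02, 0.3) for a in errors_a]

point, lo, hi = bootstrap_diff_ci(errors_a, errors_b)
print(f"mean difference {point:.3f}, 95% interval [{lo:.3f}, {hi:.3f}]")
```

If the interval comfortably straddles zero, the apparent gain is not something to build conclusions on. This only covers sampling noise over the evaluation cases; it says nothing about the unknown unknowns.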
The funny thing is, I didn't check the author's name until just now. Ed Dougherty, who people below have derided as a "mere engineer", has been working on these problems forever. I'm honestly surprised he's still active or even alive: he was a graybeard when I heard his talk a decade ago. He is a bona fide systems biologist, one of the oldest ones.
At that time, his group was doing gene regulatory network inference on gene expression with ~600 genes. They were using the kind of approach (MC) you mention to infer a small subset of the overall network.
The main thing I took away from their results (at the time) is that you can get multiple drastically different network topologies, all with similar values of the objective function. This implies GRN inference was not recovering some kind of underlying reality. It also suggests you cannot accurately infer subnetworks, which in turn suggests cellular networks aren't all that modular.
So a distinction really should be drawn between models that are merely predictive and those that also capture the underlying reality; the latter is even harder.
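A toy sketch of why similar objective values don't pin down a topology (this is my own illustration with made-up data, not the method his group used): if two candidate regulators are co-expressed, "A regulates the target" and "B regulates the target" fit almost equally well, even though only A drives the target in the simulation. The gene names, noise levels, and `fit_single_regulator` are all assumptions.

```python
import random


def fit_single_regulator(x, y):
    """Least-squares fit y ~ w * x (no intercept); return (w, mean squared error)."""
    w = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    mse = sum((b - w * a) ** 2 for a, b in zip(x, y)) / len(x)
    return w, mse


rng = random.Random(0)
n = 200
gene_a = [rng.gauss(0, 1) for _ in range(n)]
gene_b = [a + rng.gauss(0, 0.05) for a in gene_a]  # co-expressed with A
target = [a + rng.gauss(0, 0.5) for a in gene_a]   # driven by A only

_, mse_a = fit_single_regulator(gene_a, target)  # topology 1: A -> target
_, mse_b = fit_single_regulator(gene_b, target)  # topology 2: B -> target

print(f"MSE with edge A->target: {mse_a:.4f}")
print(f"MSE with edge B->target: {mse_b:.4f}")
```

Both edges are about equally predictive, so the fit alone cannot tell you which wiring is real. With hundreds of genes and many correlated candidates, the number of near-equivalent topologies grows quickly.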
> We rely a lot on complex computer simulations, or complex physics-based models...we want to learn from these models, and we want to reach conclusions from them.
Not in molecular biology. There genuinely are no models like that except in very limited subfields like protein folding, and 99% of biologists would see them as mathematical mumbo-jumbo.
I see from your bio you're also in engineering research. You would not believe it if I told you how mathematically illiterate the average PhD biologist is. My PhD alma mater added a statistics course for the first time last year, a 2-week summer course. Calculus I is "recommended" for admission. This is not unusual.
It isn't seen as needed, because state-of-the-art research is basically all qualitative, with a quantitative veneer of t-tests on top. So I'm glad to hear other fields at least recognize the problem. Biology hasn't even got that far.