| > Mere Monte Carlo state exploration is wasteful and doesn't provide much insight. Often we don't have error bars on model outputs to even know if an "improvement" in a metric is significant. The funny thing is, I didn't check the author's name until just now. Ed Dougherty, who people below have derided as a "mere engineer", has been working on these problems forever. I'm honestly surprised he's still active or even alive: he was a graybeard when I heard his talk a decade ago. He is a bona fide systems biologist, one of the oldest ones. At that time, his group was doing gene regulatory network inference on gene expression with ~600 genes. They were using the kind of approach (MC) you mention to infer a small subset of the overall network. The main thing I took away from their results (at the time) is that you can get multiple drastically different network topologies all with similar metrics on the objective function. This implies GRN inference was not inferring some kind of underlying reality. It also suggests you cannot accurately infer subnetworks, which in turn suggests cellular networks aren't all that modular. So a distinction really should be drawn between models that are simply predictive and those that also model the underlying reality, which is even harder. > We rely a lot on complex computer simulations, or complex physics-based models...we want to learn from these models, and we want to reach conclusions from them. Not in molecular biology. There genuinely are no models like that except in very limited subfields like protein folding, and 99% of biologists would see them as mathematical mumbo-jumbo. I see from your bio you're also in engineering research. You would not believe it if I told you how mathematically illiterate the average PhD biologist is. My PhD alma mater added a statistics course for the first time last year, a two-week summer course. Calculus I is "recommended" for admission. This is not unusual. 
It isn't seen as needed, because state-of-the-art research is basically all qualitative, with a quantitative veneer of t-tests overlaid on top. So I'm glad to hear other fields at least recognize the problem. Biology hasn't even got that far. |
I take your point about the distinction between models that reproduce behavior ("simply predictive") vs. models of underlying components, and what you can learn from both.
This comes up in fields I work in, with machine learning models vs. physics-based models. E.g., ML models that take a field of wind vectors at time t and predict the wind at time t+1, vs. physical models that implement the flow equations. You can fit the parameters of both flavors of model to match observations, but we certainly have more confidence in the robustness of the physics-based models.
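A toy sketch of that robustness gap, under heavy assumptions: the "wind" is a 1-D periodic field that simply shifts a fixed number of cells per step, the physics-style model has one fitted parameter (the shift), and the data-driven model is a deliberately crude nearest-neighbor lookup standing in for an ML model. All names and numbers here are made up for illustration; it says nothing about any real forecasting system.

```python
import math

N = 32          # grid cells on a periodic 1-D domain (toy assumption)
TRUE_SHIFT = 3  # cells the pattern moves per time step (toy assumption)

def advect(field, shift):
    """Physics-style step: move the whole field `shift` cells downstream."""
    return [field[(i - shift) % N] for i in range(N)]

def mse(a, b):
    """Mean squared error between two fields."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def wave(k, phase):
    """Smooth sinusoidal field with wavenumber k."""
    return [math.sin(2 * math.pi * k * i / N + phase) for i in range(N)]

# Training observations: smooth waves only.
train_inputs = [wave(k, p) for k in (1, 2) for p in (0.0, 0.7, 1.4)]
train_pairs = [(f, advect(f, TRUE_SHIFT)) for f in train_inputs]

def fit_physics(pairs):
    """Fit the single physical parameter by brute-force search over shifts."""
    return min(range(N),
               key=lambda s: sum(mse(advect(x, s), y) for x, y in pairs))

def predict_lookup(pairs, field):
    """Data-driven stand-in: return the output of the closest training input."""
    _, y = min(pairs, key=lambda p: mse(p[0], field))
    return y

fitted_shift = fit_physics(train_pairs)

# Both models reproduce the training observations.
train_err_physics = sum(mse(advect(x, fitted_shift), y) for x, y in train_pairs)
train_err_lookup = sum(mse(predict_lookup(train_pairs, x), y) for x, y in train_pairs)

# Out-of-sample test: a single sharp gust, unlike anything in training.
gust = [1.0 if i == 10 else 0.0 for i in range(N)]
truth = advect(gust, TRUE_SHIFT)
physics_err = mse(advect(gust, fitted_shift), truth)
lookup_err = mse(predict_lookup(train_pairs, gust), truth)

print("fitted shift:", fitted_shift)
print("training error (physics, lookup):", train_err_physics, train_err_lookup)
print("gust error     (physics, lookup):", physics_err, lookup_err)
```

Both models match the training data, but only the one that encodes the transport mechanism carries over to an input that looks nothing like the training set. A real ML model would degrade more gracefully than a lookup table, so treat this as an exaggerated caricature of the point.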
About mathematically challenged biologists: here's a hypothesis. I'll bet that if you started scanning conference abstracts in your domain for "uncertainty quantification," some more carefully posed modeling activities would crop up. (As you suggest, probably in the domains where more quantitative work is done.)
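For anyone who hasn't met the term, about the simplest form of uncertainty quantification is putting an interval on a metric difference, which speaks directly to the "is this improvement significant?" complaint quoted above. Below is a minimal sketch using a paired percentile bootstrap. The function name, the per-case error lists, and the resample count are all illustrative assumptions, and the data is synthetic.

```python
import random

def bootstrap_diff_ci(errors_a, errors_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(errors_a - errors_b).

    The two lists are assumed to be paired: entry i of each list is the
    error of model A and model B on the same test case.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    means = []
    for _ in range(n_resamples):
        # Resample test cases with replacement and record the mean difference.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Synthetic per-case errors (made up): model B is only marginally better
# on average, with a lot of case-to-case noise.
data_rng = random.Random(1)
errors_a = [abs(data_rng.gauss(1.0, 0.5)) for _ in range(40)]
errors_b = [abs(a + data_rng.gauss(-0.02, 0.5)) for a in errors_a]

lo, hi = bootstrap_diff_ci(errors_a, errors_b)
improvement_is_clear = lo > 0 or hi < 0  # interval excludes zero
print(f"mean(A - B) interval: [{lo:.3f}, {hi:.3f}]; clear difference: {improvement_is_clear}")
```

If the interval straddles zero, the headline "improvement" is not distinguishable from resampling noise on this test set. This only captures sampling variability in the evaluation data, not uncertainty in the model structure or its inputs, which is where the more serious UQ work starts.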