by nonbel 3513 days ago
"I don't see how your explanation could enable anything but a statistical model."

First, I wouldn't call Armitage and Doll a "statistical model"; it is more of a "rational model" derived from first-principles considerations. "Statistical models" are things like linear regressions, at least to me.

Second, my explanation is useful in that, if it is correct, we would need a certain combination of division and error rates to explain age-specific incidence curves. So, within the context of the model (which is commonly accepted), we can put upper and lower bounds on these values from epidemiological data.

See for example my earlier discussion on this site[1]. Even if you disagree with my conclusions (somatic mutation can't do it... something is up and hundreds of billions to trillions of $ have been wasted barking up the somatic mutation tree), or find a mistake, that is still what it can be used for:

[1] https://news.ycombinator.com/item?id=12669110

1 comments

First, from Wikipedia [1]: "The Armitage–Doll model is a statistical model of carcinogenesis [...]"

Second, the paper itself [2] talks about cancer rates and uses experimental data for it (I only skimmed it, so I can't give a decent summary). This is a statistical model.

I think we disagree since you misunderstand what the term statistical model means. Furthermore, you are misusing the term "first principles." First principles really means that you start from a well established theory that describes how something works. From there you predict mathematically, or with a computer simulation, what the reality is. Using experimental data is strictly forbidden.

From Wikipedia [3]: "In physics and other sciences, theoretical work is said to be from first principles, or ab initio, if it starts directly at the level of established science and does not make assumptions such as empirical model and fitting parameters."

I generally have no idea about cancer research, so I have to trust you and cannot comment on the usefulness of the approach.

[1] https://en.wikipedia.org/wiki/Armitage%E2%80%93Doll_multista...

[2] http://www.nature.com/bjc/journal/v91/n12/pdf/6602297a.pdf

[3] https://en.wikipedia.org/wiki/First_principle

1) The original implementation of the Armitage-Doll model contains simplifications made for computational reasons that mess it up. You can easily see this by checking that the probability of a mutation per cell goes over one (see eq 1 in the appendix of Armitage and Doll 1954). Check the link to my earlier discussion on this site for the corrected version. BTW, I also see some discussion of Armitage & Doll in the appendix of the current Alexandrov et al. (2016) paper (they use the simplified version of the model).

2) Every model is originally based on some kind of observation. The Armitage-Doll model is basically "cancer is caused by the accumulation of errors in a single cell", then they go on and do the math from there. Sure, it would be great to know exactly what those errors are, how many it takes, which cells, etc so that we can constrain all the parameters. You are saying that "from first principles" precludes having any parameters, either free or determined by data?

3) As I said, I think the above is quite different from a statistical model like y = a + b*x + eps. Note: In some cases you can deduce an equation like that from an idea like Armitage-Doll, which is fine. Armitage-Doll definitely has more content to it.
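To illustrate point 1) above, here is a minimal sketch of how a small-rate approximation can push a probability above one. It is not the corrected version from the linked discussion and not the paper's own notation. It assumes, purely for illustration, that all r stages share one rate p, that the stages happen in order, and that nothing is ever cleared. The function names are made up for this example.

```python
import math

def approx_cumulative(p, t, r):
    """Small-p*t approximation: probability that one cell has completed
    all r stages by time t, i.e. (p*t)**r / r!  (equal rates assumed)."""
    return (p * t) ** r / math.factorial(r)

def exact_cumulative(p, t, r, terms=200):
    """Exact probability under the same assumptions. The waiting time for
    r ordered exponential stages of rate p is Erlang distributed, so the
    CDF is exp(-x) * sum_{k>=r} x**k / k!  with x = p*t.
    The tail sum is used (truncated after `terms` terms) to avoid
    cancellation when x is tiny."""
    x = p * t
    term = x ** r / math.factorial(r)
    total = 0.0
    for k in range(r, r + terms):
        total += term
        term *= x / (k + 1)  # next term: x**(k+1) / (k+1)!
    return math.exp(-x) * total

r = 6  # number of stages, chosen arbitrarily for the illustration
for pt in (0.01, 0.1, 1.0, 3.0, 5.0):
    a = approx_cumulative(pt, 1.0, r)
    e = exact_cumulative(pt, 1.0, r)
    print(f"p*t={pt:<5} approx={a:.3e} exact={e:.3e}")
```

While p*t is small the two agree closely. Once p*t reaches a few units the approximation exceeds one, whereas the exact expression stays a probability.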

>"The original implementation of the Armitage-Doll model contains simplifications made for computational reasons"

Quoting myself rather than editing... Actually, I forgot they come right out and say it:

"This result will be valid for large values of t (of the order of a human lifetime) provided that p_1 t, p_2 t, ..., p_r t are all sufficiently small (as could be assumed in an application of this theory to human cancer)." http://www.nature.com/bjc/journal/v91/n12/pdf/6602297a.pdf

But with a low probability of mutation at a given site, p, how can you get their model to turn over (as is seen in the age-specific incidence data)? I don't think you can; however, the turnovers easily appear if you allow high mutation rates along with high clearance rates. But those high mutation rates are inconsistent with the estimated mutation rates in human cells.
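A rough way to see why a turnover needs some extra ingredient: the sketch below computes the per-cell hazard under the simplest reading of the model. It assumes r ordered stages with one shared rate p and no clearance at all. Those are assumptions made for this example, so it does not reproduce the high-mutation, high-clearance variant described above. The function names are hypothetical.

```python
import math

def power_law_hazard(p, t, r):
    """Small-p*t approximation of the per-cell hazard:
    p * (p*t)**(r-1) / (r-1)!"""
    return p * (p * t) ** (r - 1) / math.factorial(r - 1)

def erlang_hazard(p, t, r):
    """Exact per-cell hazard when each of r ordered stages has rate p and
    nothing is cleared: Erlang density divided by Erlang survival
    (the exp(-p*t) factors cancel)."""
    x = p * t
    survival_terms = sum(x ** k / math.factorial(k) for k in range(r))
    return p * (x ** (r - 1) / math.factorial(r - 1)) / survival_terms

def is_increasing(values):
    """True if every value is strictly larger than the one before it."""
    return all(b > a for a, b in zip(values, values[1:]))

r = 6                        # number of stages, arbitrary for the example
ages = list(range(1, 101))   # time in arbitrary units
for p in (1e-4, 1e-2, 1.0):
    approx = [power_law_hazard(p, t, r) for t in ages]
    exact = [erlang_hazard(p, t, r) for t in ages]
    print(p, is_increasing(approx), is_increasing(exact), max(exact) < p)
```

Under these assumptions both curves only rise with age. The power law grows without limit, while the exact hazard levels off below p. Neither turns over, whether p is low or high, which is consistent with needing something like clearance to get a decline.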

Anyway, I hope someone checks into it because something is wrong with the mainstream model of carcinogenesis.

Edit:

It is also possible that the data used to give the age-specific incidence (i.e., SEER) are fatally flawed and those turnovers are artefacts.