by QuantumRoar 3516 days ago
First, from Wikipedia [1]: "The Armitage–Doll model is a statistical model of carcinogenesis [...]"

Second, the paper itself [2] talks about cancer rates and uses experimental data for it (I only skimmed it, so I can't give a decent summary). This is a statistical model.

I think we disagree because you misunderstand what the term "statistical model" means. Furthermore, you are misusing the term "first principles." First principles really means that you start from a well-established theory that describes how something works. From there you predict, mathematically or with a computer simulation, what reality should look like. Using experimental data is strictly forbidden.

From Wikipedia [3]: "In physics and other sciences, theoretical work is said to be from first principles, or ab initio, if it starts directly at the level of established science and does not make assumptions such as empirical model and fitting parameters."

I generally have no idea about cancer research, so I have to trust you and cannot comment on the usefulness of the approach.

[1] https://en.wikipedia.org/wiki/Armitage%E2%80%93Doll_multista...

[2] http://www.nature.com/bjc/journal/v91/n12/pdf/6602297a.pdf

[3] https://en.wikipedia.org/wiki/First_principle

1 comments

1) The original implementation of the Armitage-Doll model contains simplifications, made for computational reasons, that mess it up. You can easily see this by checking that the probability of a mutation per cell can exceed one (see eq. 1 in the appendix of Armitage and Doll 1954). Check the link to my earlier discussion on this site for the corrected version. BTW, I also see some discussion of Armitage & Doll in the appendix of the current Alexandrov et al. (2016) paper (they use the simplified version of the model).

2) Every model is originally based on some kind of observation. The Armitage-Doll model is basically "cancer is caused by the accumulation of errors in a single cell", and then they go on and do the math from there. Sure, it would be great to know exactly what those errors are, how many it takes, which cells, etc., so that we can constrain all the parameters. Are you saying that "from first principles" precludes having any parameters, either free or determined by data?

3) As I said, I think the above is quite different from a statistical model like y = a + b*x + eps. Note: In some cases you can deduce an equation like that from an idea like Armitage-Doll, which is fine. Armitage-Doll definitely has more content to it.
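To illustrate the note in 3): in the small-rate limit, the Armitage-Doll incidence takes the power-law form I(t) ≈ c * t^(r-1), so taking logs gives log I = log c + (r-1) * log t, which has exactly the y = a + b*x shape. The sketch below fits that line to synthetic data. The values of r and c, the ages, and the helper name loglog_fit are all made up for illustration; none of them come from the paper or from SEER.

```python
import math

def loglog_fit(ages, rates):
    """Ordinary least squares of log(rate) on log(age); returns (intercept, slope)."""
    xs = [math.log(t) for t in ages]
    ys = [math.log(v) for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical parameters: r stages, overall constant c (no noise added).
r = 6
c = 1e-12
ages = [30, 40, 50, 60, 70]
rates = [c * t ** (r - 1) for t in ages]  # synthetic power-law incidence

a, b = loglog_fit(ages, rates)
print("intercept a =", a, "slope b =", b, "implied stages =", b + 1)
```

The difference from a purely statistical fit is that here the slope has a meaning: b + 1 is the number of stages the model assumes.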

>"The original implementation of the Armitage-Doll model contains simplifications made for computational reasons"

Quoting myself rather than editing... Actually, I forgot that they come right out and say it:

"This result will be valid for large values of t (of the order of a human lifetime) provided that p_1 t, p_2 t, ..., p_r t are all sufficiently small (as could be assumed in an application of this theory to human cancer)." http://www.nature.com/bjc/journal/v91/n12/pdf/6602297a.pdf
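To make the "sufficiently small" condition concrete, here is a minimal sketch that assumes every stage has the same rate p. That is a simplification chosen for illustration, not the paper's general p_1 ... p_r and not necessarily the corrected version linked earlier. Under that assumption, the small-rate approximation for the probability that one cell has completed all r stages by time t is (p*t)^r / r!, while the exact value is the Erlang CDF. The function names and the values of r, t and p are hypothetical.

```python
import math

def approx_cumulative(p, t, r):
    """Small p*t approximation: (p*t)^r / r!. Nothing bounds this by 1."""
    return (p * t) ** r / math.factorial(r)

def exact_cumulative(p, t, r):
    """Erlang CDF: probability that r sequential rate-p events all occurred by time t."""
    x = p * t
    return 1.0 - math.exp(-x) * sum(x ** n / math.factorial(n) for n in range(r))

r = 6      # hypothetical number of stages
t = 80.0   # years, roughly a human lifetime
for p in (1e-3, 1e-2, 5e-2):  # hypothetical per-year stage rates
    print(p, approx_cumulative(p, t, r), exact_cumulative(p, t, r))
```

With small p*t the two agree closely. Once p*t reaches a few units, the approximate "probability" passes one while the exact one stays below it, which is the kind of breakdown the quoted condition is there to rule out.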

But with a low probability of mutation p at a given site, how can you get their model to turn over (as is seen in the age-specific incidence data)? I don't think you can. However, the turnovers easily appear if you allow high mutation rates along with high clearance rates. But those high mutation rates are inconsistent with the estimated mutation rates in human cells.

Anyway, I hope someone checks into it because something is wrong with the mainstream model of carcinogenesis.

Edit:

It is also possible that the data used to give the age-specific incidence (i.e., SEER) are fatally flawed and those turnovers are artefacts.