by cshimmin 1532 days ago
As another ATLAS physicist, I can say that this is an excellent article from Prof. Schott. He is very politely arguing that "someone messed up". I'm not sure I agree so much with the point about combining the LEP experiments, which do have some tension with each other, unless the combination specifically takes into account correlations between uncertainties at the different experiments on the same collider (these exist, but they're really hard to handle).

Another take many people in the field are expressing is that it's simply infeasible to reliably interpret statistical models at that level (especially one that is dominated by systematic uncertainty), since they are based on approximations and assumptions e.g. that certain nuisance parameters are "nicely" distributed and uncorrelated. See e.g. comments from Prof. Cranmer [1] who is one of the folks who developed the standard statistical formalism and methods used in modern particle physics experiments.

[1] https://twitter.com/kylecranmer/status/1512222463094140937?s...
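To make the point about correlated uncertainties concrete, here is a minimal sketch of a best linear unbiased combination of two measurements. The function name `combine_two` and all numbers are hypothetical and purely illustrative. They are not real W mass inputs, and this is not the procedure any experiment actually uses.

```python
import math

def combine_two(x1, s1, x2, s2, rho):
    """Combine two measurements x1 +/- s1 and x2 +/- s2 whose
    uncertainties have correlation coefficient rho (hypothetical helper).

    Uses the minimum-variance linear combination w1*x1 + w2*x2 with
    w1 + w2 = 1. Undefined when the denominator is zero
    (e.g. rho = 1 with s1 == s2).
    """
    cov = rho * s1 * s2                   # covariance between the two results
    denom = s1**2 + s2**2 - 2 * cov
    w1 = (s2**2 - cov) / denom            # weight of the first measurement
    w2 = 1 - w1
    mean = w1 * x1 + w2 * x2
    var = (s1**2 * s2**2 - cov**2) / denom
    return mean, math.sqrt(var)

# Illustrative numbers only: same inputs, with and without correlation.
uncorrelated = combine_two(0.0, 1.0, 2.0, 1.0, rho=0.0)
correlated = combine_two(0.0, 1.0, 2.0, 1.0, rho=0.5)
print(uncorrelated)  # combined value and uncertainty, assuming rho = 0
print(correlated)    # same inputs with rho = 0.5: larger uncertainty
```

With equal uncertainties the central value is the same in both cases, but treating positively correlated inputs as independent understates the combined uncertainty.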


Why don't people use nonparametric methods to get around the problem of assuming certain parameters are "nicely" distributed? (Not a physicist, but curious – this seems like the "obvious" solution.)
Nonparametric methods are often used when the assumptions of parametric tests don't hold.

In physics experiments, they want to fix the structure of the model and know its assumptions. They want the assumed distributions and parameters to hold. If the assumptions don't hold, they must find out why, find better assumptions, and fix the model.

To say it differently: physicists are not trying to discover statistical laws. They are trying to discover physical laws through statistics.

It sounds like they know certain assumptions regarding parameters that are not of interest are wrong. So why explicitly model those, if we don't care about their distribution? We (apparently) only care about an accurate estimate of W boson mass.
That works if the thing is something you can remove from the experiment, model separately, and then put back. At CERN, many variables are tied to this one huge machine that is one of a kind.
I admit I'm not familiar with the model used to aggregate the boson data. But there's an entire community of nonparametric/semiparametric statisticians that works on problems just like this. It seems crazy to me that millions of dollars are spent to build the machines to collect this data, yet the papers are written using statistical models with distributional/independence assumptions known to be false. (The tweet linked above seems to be saying something similar.)

Is there a concrete reason we can't be naive and just bootstrap confidence intervals for example? Of course I defer to the physicists here – but I'm curious whether there's some simple high-level reason the usual tricks don't work.
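For readers unfamiliar with the "naive" bootstrap mentioned here, a minimal sketch of a percentile bootstrap confidence interval follows. The function name `bootstrap_ci`, its parameters, and the toy data are assumptions for illustration. It treats the data as independent draws, so it only captures the statistical spread of the sample and says nothing about systematic effects shared by every data point.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000,
                 alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(data) (hypothetical helper).

    Resamples the data with replacement, recomputes the statistic each
    time, and reads the interval off the sorted resampled estimates.
    """
    rng = random.Random(seed)             # fixed seed for reproducibility
    n = len(data)
    estimates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lo_idx = round((alpha / 2) * n_resamples)
    hi_idx = round((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]

# Toy data, made up for illustration only.
toy = [9.8, 10.1, 10.4, 9.6, 10.0, 10.3, 9.9, 10.2, 9.7, 10.5,
       10.0, 9.9, 10.1, 10.2, 9.8]
low, high = bootstrap_ci(toy)
print(low, high)  # approximate 95% interval for the mean of the toy data
```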

Don't worry. High energy physics has been at the bleeding edge of statistical methods forever.
Because it’s all interrelated and too many variables make it impossible to nail down anything with certainty if you don’t assume some invariants somewhere?