Hacker News
by arafa 2565 days ago
As someone who uses statistics all the time at work, I sympathize so much with this article and greatly enjoyed it. Every time I try to introduce a Bayesian prior, coworkers either look at me like I'm crazy (because they've never heard of or used Bayesian stats) or like I've suddenly gone soft and introduced a bunch of nebulous, touchy-feely context into the objective truth (if they're dedicated frequentists).

Then we promptly switch back to p-values of .05, a lot of the time not even bothering with a statistical power calculation. I've had better success with introducing power, though. I suspect that's because we can fit it into the existing frequentist framework.
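For anyone who wants to see what that power calculation looks like, here is a minimal sketch. It assumes a two-sided, two-sample z-test with equal group sizes and a standardized effect size (Cohen's d); the function names are made up for illustration, and a proper t-test calculation would give slightly different numbers.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    Assumes equal group sizes and a standardized effect size (Cohen's d).
    This is a normal approximation, not an exact t-test calculation.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Mean of the test statistic under the alternative hypothesis.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def n_for_power(effect_size, target=0.8, alpha=0.05):
    """Smallest per-group sample size that reaches the target power."""
    n = 2
    while two_sample_power(effect_size, n, alpha) < target:
        n += 1
    return n


if __name__ == "__main__":
    print(two_sample_power(0.5, 64))
    print(n_for_power(0.5))
```

With no true effect, the "power" collapses to alpha, which is a handy sanity check on the formula.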


> like I've suddenly gone soft and introduced a bunch of nebulous, touchy-feely context into the objective truth

This drives me nuts. If you haven't, check out the paper "Beyond subjective and objective in statistics" by Gelman and Hennig (2017).

Right at the beginning they make the point that any analysis includes external information in many ways, such as adjusting variables for imbalance, how we deal with outliers, regularization, etc.

Especially if you're doing any sort of causal inference, you're usually making strong assumptions before estimating your model, even just in terms of which variables are included and how they're connected. The idea that priors are somehow ruining an "objective" model is just absurd to me. You're already making so many other decisions about your model that will affect estimates and your interpretation of them. Priors seem like another perfectly reasonable decision to have to make as well, with the benefit of getting results that I think in general are much more easily understood by a lay audience. (E.g., I don't think I've ever encountered someone not on my data science team that actually understands what a p-value is. But people are much better at understanding when I say, there's an X percent chance that there is a positive effect here.)
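To make the "X percent chance of a positive effect" statement concrete, here is a rough sketch for a two-variant conversion test. It assumes a Beta-Binomial model with the same Beta prior on both conversion rates and estimates the probability by Monte Carlo; the function name, the flat default prior, and the example counts are all illustrative assumptions, not anything from the article.

```python
import random


def prob_b_beats_a(succ_a, n_a, succ_b, n_b, prior=(1.0, 1.0),
                   draws=50_000, seed=0):
    """Posterior probability that variant B's rate exceeds variant A's.

    Assumes independent Beta(prior) priors on both rates and binomial
    data, so each posterior is Beta(a + successes, b + failures).
    """
    rng = random.Random(seed)
    a0, b0 = prior
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(a0 + succ_a, b0 + n_a - succ_a)
        rate_b = rng.betavariate(a0 + succ_b, b0 + n_b - succ_b)
        wins += rate_b > rate_a
    return wins / draws


if __name__ == "__main__":
    # Hypothetical counts: 40/1000 conversions for A, 55/1000 for B.
    p = prob_b_beats_a(40, 1000, 55, 1000)
    print(f"Probability that B is better than A: {p:.1%}")
```

The printed number is a direct probability statement about the effect, conditional on the model and the prior, which is what makes it easier to explain than a p-value.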

This critique might come from the idea that having a good analytic model, or at least some valuable analytic insights, involves much more than assigning some priors. Of course, the two things don't exclude each other, but for some frequentists Bayesians have the wrong perspective - or at least that's the critique, whether it's true or not.

Another issue that I personally have with Bayesianism is that I believe that assigning probabilities to singular events is only meaningful and admissible at all if there is a good analytic explanation for the respective propensity. For example, we may be able to deduce that a die is reasonably fair from the way it is constructed and our knowledge of physics, and later confirm this by frequentist analysis. Merely believing or claiming that the die is fair is not acceptable. Again, the difference is only one of attitude in the end, I suppose.

Maybe philosophers have given Bayesian statistics a bad rap, too, because many of those who call themselves Bayesians are also "probabilists", i.e., they think that rational belief must conform to the probability calculus. There are many arguments against probabilism and the only arguments that speak for it are Dutch book arguments. The view does not have very strong foundations.

> assigning probabilities to singular events is only meaningful and admissible at all if there is a good analytic explanation for the respective propensity.

Wait a minute, you are making a type error here: probabilities are not propensities. They're degrees of belief. (And even if you disagree in general, this is a Bayesian context you're talking about.)

If I put a die on a table and hide it with a cup, you could still estimate your probability distribution about which face is up. My probability distribution would obviously be very different, since I put the die in there myself. (Replace "probability" by "betting ratio" or "degrees of belief" if it makes more sense to you.)

> The [probabilism] view does not have very strong foundations.

Read the first 2 chapters of Probability Theory: the Logic of Science, by E. T. Jaynes: "Plausible reasoning" and "The quantitative rules". It's very accessible, and you shall see how strong the foundations really are.

http://www.med.mcgill.ca/epidemiology/hanley/bios601/Gaussia...

No, I was not speaking from a Bayesian perspective, I was laying out the propensity-theoretic explanation of probability. The propensity explanation is one of several attempts to explain why singular events might be said to give rise to probabilities, living alongside frequentism and Bayesianism. Another perspective worth mentioning is the logical approach, which is in the end purely combinatorial.

Some people think that you need to explain why a die can be fair, rather than just assuming it or only looking at it from a frequentist perspective. Of course, die-hard Bayesians don't think so, but that would be begging the question in the context of discussing criticisms of Bayesianism.

> Read the first 2 chapters of Probability Theory: the Logic of Science, by E. T. Jaynes: "Plausible reasoning" and "The quantitative rules". It's very accessible, and you shall see how strong the foundations really are.

I'm an expert on this topic. The only arguments for probabilism are Dutch book arguments, and there is a large number of arguments against these. See for example various articles by Hajek. Alternative representations of graded belief are, among others:

- plausibility theory (Halpern et al.)

- possibility theory (Dubois & Prade)

- Spohn's ranking theory and variants thereof

- various notions of epistemic entrenchment

- Dempster-Shafer belief theory

- almost any quantitative or qualitative representation of belief in belief revision theory not covered by one of the above theories (e.g. belief update by Katsuno & Mendelzon)

- by a general logical connection, nonmonotonic logics and AAFs can generally represent notions of belief update, such that the underlying qualitative ordering of states is a representation of graded belief

What you probably mean is that the above generalizations (or qualitative theories, in some cases) could be simulated with probabilities, e.g. by using convex sets of probabilities or what Josang is doing in his "subjective logic". That's true, but then we're no longer talking about probabilism in the sense I've used the word.

Of course, you can also try arguing for probabilism like Savage did: Lay out a set of postulates for your subjective plausibility that happen to allow you to prove that this notion of subjective plausibility is in the end probability. Despite the merits of such work, it is in the end a form of cheating (or "reverse engineering"), because you could just as well come up with plausible postulates that yield the weaker axioms of possibility theory.

> No, I was not speaking from a Bayesian perspective, I was laying out the propensity-theoretic explanation of probability.

Unless you can explain this "propensity" in terms of actual physical properties, propensity by itself is… unjustified. The only domain I know of so far where we could possibly argue propensities are a thing is quantum mechanics. And even then it seems to rest on an anthropic argument: which universe am I living in?

> Some people think that you need to explain why a die can be fair,

A die by itself is not fair, right? A die might be balanced, and the way it is thrown it might have enough unpredictable variability to cause everyone in the room to think "uniform distribution over [1..6]".

Likewise, a cryptographic pseudo random generator is unpredictable (and thus "fair"), to anyone who doesn't know its internal state. Even though the process itself is deterministic, it's just not computationally feasible to guess its output just from the observation of past outputs. (Though for this one I'm relying on the fact we're not logically omniscient.)

> I'm an expert on this topic.

Good. Then you know that any inference strategy that falls prey to Dutch Books is not rational. Right?

To be fair, probability theory is not computationally tractable. I did not verify, but I guess any feasible approximation is vulnerable to some more or less subtle Dutch Books.

Now the way you talk about Dutch Books sounds like all the other strategies you mention are vulnerable, not just in practice, but in theory as well. They are thus not perfectly rational. Do their authors at least have the grace to admit this is a flaw that should be corrected?
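For readers who haven't seen one, a toy Dutch Book looks like the sketch below. It assumes the textbook setup: the agent posts betting quotients for A and for not-A and is willing to both buy and sell unit-stake bets at those prices. The function name is hypothetical and this is only the simplest two-proposition case.

```python
def agent_net_payoffs(q_a, q_not_a, stake=1.0):
    """Agent's net payoff in each possible world against a simple Dutch Book.

    Assumption: the agent will buy or sell a bet paying `stake` on A at
    price q_a * stake, and likewise for not-A at q_not_a * stake.
    """
    total = q_a + q_not_a
    # If the prices sum to more than 1, the bookie sells both bets to the
    # agent; otherwise the bookie buys both bets from the agent.
    side = 1 if total > 1 else -1  # +1: agent buys, -1: agent sells
    cost = stake * total
    payoffs = {}
    for a_is_true in (True, False):
        win_a = stake if a_is_true else 0.0
        win_not_a = 0.0 if a_is_true else stake
        payoffs[a_is_true] = side * (win_a + win_not_a - cost)
    return payoffs


if __name__ == "__main__":
    print(agent_net_payoffs(0.6, 0.6))    # quotients sum above 1
    print(agent_net_payoffs(0.3, 0.5))    # quotients sum below 1
    print(agent_net_payoffs(0.25, 0.75))  # coherent quotients
```

Whenever the two quotients do not sum to 1, the agent's payoff is negative in both worlds; with coherent quotients this particular book breaks even.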

But then I suspect that correcting the flaw inevitably leads to probability theory itself: if you accept Jaynes's three "desiderata" as required for any kind of rational reasoning, as he shows, the result is necessarily equivalent to probability theory as we know it (where probabilities are subjective assessments of plausibility, otherwise known as "degrees of belief").

I can only conclude that you do not accept Jaynes's desiderata as necessary for correct inference. And this is the point where I look at you like you're not quite sane.

For reference, Jaynes's desiderata:

  (1) Degrees of plausibility are represented by real
      numbers. (And a continuity assumption.)

  (2) Qualitative correspondence with common sense.
      (explained in more detail in the book)

  (3a) If a conclusion can be reasoned out in more than
       one way, then every possible way must lead to the
       same result.

  (3b) The robot always takes into account all of the
       evidence it has relevant to a question. It does
       not arbitrarily ignore some of the information,
       basing its conclusions only on what remains. In
       other words, the robot is completely
       non-ideological.

  (3c) The robot always represents equivalent states of
       knowledge by equivalent plausibility assignments.
       That is, if in two problems the robot’s state of
       knowledge is the same (except perhaps for the
       labeling of the propositions), then it must assign
       the same plausibilities in both.

Good luck convincing me (and, I suspect, the majority of people, including frequentist statisticians) that we should reject any of these desiderata.

I don't care that it's reverse engineering, those desiderata match the way I think. I accept the conclusion that probability theory is the correct (albeit intractable) way to think, because I ultimately agree with the postulates it rests on. Vehemently so. They're not just true, they're obvious.

If you don't accept them, then I can only give up, and remember what Yudkowsky once wrote: "How do you argue a rock into becoming a mind?"

> Good. Then you know that any inference strategy that falls prey to Dutch Books is not rational. Right?

Do you even have an idea what "rational" means? There are people who argue that having cyclic preferences is not only rational, but even sometimes the only rational representation of evaluations. I'm not one of these, but just wanted to mention that things are not as simple as you lay them out.

If by "rational" you mean "fine for decision making", then I need to disappoint you. Dutch Books are not a working criterion for that. It is perfectly possible to make rational decisions with cyclic preferences. Your preferences need to be weakly eligible and weak eligibility needs to be top-transitive (Hansson).

Weak eligibility: There are one or more alternatives such that there is no preferred alternative to them.

Top transitivity of weak eligibility: If a is weakly eligible and a~b, then b is also weakly eligible.

These are conditions on preferences. You can have similar conditions on subjective plausibility, of course, once you combine preferences and subjective plausibilities.
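Here is a small sketch of how the two conditions can be checked on a finite set of alternatives. It follows my reading of the definitions as stated above, not Hansson's formal text; the representation (a set of strict-preference pairs plus a set of indifference pairs) and the function names are assumptions made for illustration.

```python
def weakly_eligible(alternatives, strictly_prefers):
    """Alternatives to which no other alternative is strictly preferred.

    `strictly_prefers` is a set of pairs (x, y) meaning x is strictly
    preferred to y.
    """
    return {a for a in alternatives
            if not any((b, a) in strictly_prefers for b in alternatives)}


def is_top_transitive(alternatives, strictly_prefers, indifferent):
    """If a is weakly eligible and a ~ b, then b must be weakly eligible.

    `indifferent` is a set of pairs, read symmetrically.
    """
    top = weakly_eligible(alternatives, strictly_prefers)
    return all(b in top
               for a in top
               for b in alternatives
               if (a, b) in indifferent or (b, a) in indifferent)


if __name__ == "__main__":
    alts = {"a", "b", "c", "d"}
    # a beats everything; b, c, d form a preference cycle below it.
    strict = {("a", "b"), ("a", "c"), ("a", "d"),
              ("b", "c"), ("c", "d"), ("d", "b")}
    print(weakly_eligible(alts, strict))
    print(is_top_transitive(alts, strict, indifferent=set()))
```

In this example the preferences are cyclic, yet there is still a well-defined choice at the top, which is the point being made about cyclic preferences and decision making.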

By the way, Expected Utility falls prey to Dutch Books. There is a money pump against every risk-averse or risk-seeking agent. Check out Wakker's book, which is much better than Jaynes's: "Prospect Theory: For Risk and Ambiguity". Anyway, EU is often considered rational and widely used, but according to your criterion it would be irrational. (In finance, this kind of Dutch Book is called "arbitrage" and exploited immediately, so the market prunes them away, but in other areas EU is used extensively. Are you maybe a finance guy???)

> For reference, Jaynes's desiderata:

Of course you can just claim "here is my list of postulates, and that's what 'rational' means", but that's not really an argument. The other theories I am talking about are also axiomatized. Take for example Fishburn's seminal work. According to your theory, Fishburn spent most of his life and efforts in decision making on irrational theories. I'm not convinced, and I would rather talk about different kinds of rationality if I were pressed to make a decision on that.

> (1) Degrees of plausibility are represented by real numbers. (And a continuity assumption.)

There is a vast array of literature on qualitative decision making for which this assumption does not hold. Lexicographic decision making also does not fulfill that requirement, and there is a whole French-Belgian school on that, including axiomatizations and practical methods (tools like ELECTRE). For lexicographic decision making usually hyperreal numbers are used.
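A quick illustration of why lexicographic orderings resist a single real-valued score: in the sketch below the first criterion always dominates the second, and any fixed finite weight on the first criterion can be overturned by a large enough value on the second. The criteria, option names, and weight are made up for illustration.

```python
def lexicographic_best(options):
    """Pick the option whose criteria tuple is largest in lexicographic order.

    Python compares tuples lexicographically, so the first criterion
    decides and later criteria only break ties.
    """
    return max(options, key=lambda name: options[name])


def weighted_best(options, weight):
    """Pick the option maximizing weight * first + second (a real score)."""
    return max(options, key=lambda name: weight * options[name][0]
               + options[name][1])


if __name__ == "__main__":
    # Hypothetical (primary criterion, secondary criterion) scores.
    options = {"A": (3, 10), "B": (3, 40), "C": (2, 1000)}
    print(lexicographic_best(options))
    print(weighted_best(options, weight=100))
```

The lexicographic rule never lets option C win, however large its secondary score, while the weighted sum does as soon as the secondary score is large enough relative to the weight.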

Qualitative decision making comes with a host of problems and limitations due to Arrow's Theorem, but lexicographic models can be very reasonable and even required if some of the authors in the field are right about some examples of seemingly irrational preferences. In any case, just to say that these axiomatized theories are irrational because "here are my axioms" is unacceptable. I'm sure not even Jaynes does that.

As for the continuity assumption: There is a whole field of measurement theory that would tell you when you need it and when you don't need it, and I really don't see any non-measurement-theoretic way of defending such technical assumptions as rationality postulates independently. Again, just assuming these kinds of things is a bit too simple. After all, I can take any postulate and call it "rational", that's not a meaningful discussion of rationality, though.

> (3b) The robot always takes into account all of the evidence it has relevant to a question. It does not arbitrarily ignore some of the information, basing its conclusions only on what remains. In other words, the robot is completely non-ideological.

This is an interesting principle, because even in probabilistic settings it is completely controversial how to deal with conflicting evidence and how and when to revise beliefs in the face of evidence that directly conflicts with your existing beliefs.

It's a very vexing and complicated problem with many different solutions. It is definitely an underdetermined problem. One of the best discussions of it has evolved from criticisms of the corresponding update rule in the Dempster-Shafer theory of evidence, so it's worth taking a look at if you're really interested in this topic. But you seem to be hell-bent on taking Jaynes's book as some sort of bible, which is weird. It's not as if any of the other approaches I've mentioned in my previous post are unknown or have been proposed by outsiders - it's almost impossible to not stumble across possibility theory (Dubois & Prade) or Halpern's work if you're doing AI research, for example.
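To give a flavor of that criticism, here is a minimal sketch of Dempster's rule of combination applied to two highly conflicting sources, in the style of the well-known example usually attributed to Zadeh. Mass functions are represented as dicts from frozensets to masses; the function name and the specific numbers are assumptions chosen for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps frozensets of hypotheses to masses summing
    to 1. Returns the combined mass function and the conflict mass K.
    """
    combined = {}
    conflict = 0.0
    for (s1, w1), (s2, w2) in product(m1.items(), m2.items()):
        overlap = s1 & s2
        if overlap:
            combined[overlap] = combined.get(overlap, 0.0) + w1 * w2
        else:
            # Mass assigned to incompatible hypotheses counts as conflict.
            conflict += w1 * w2
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Renormalize the non-conflicting mass by 1 - K.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}, conflict


if __name__ == "__main__":
    # Two sources that almost completely disagree, but both leave a
    # sliver of mass on hypothesis B.
    source_1 = {frozenset("A"): 0.99, frozenset("B"): 0.01}
    source_2 = {frozenset("C"): 0.99, frozenset("B"): 0.01}
    result, k = dempster_combine(source_1, source_2)
    print(result, k)
```

After renormalization, essentially all of the combined mass ends up on B, the hypothesis both sources considered very unlikely, which is why the handling of conflicting evidence under this rule has been debated so much.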

> They're not just true, they're obvious.

Maybe for people who do not know the literature very well, but certainly not to me. Sorry. :(

My understanding of physics is that no die toss can be considered "fair" because such a macroscopic system behaves deterministically according to Newton's laws, and isn't even too chaotic to model accurately. No matter the shape or balance of the die, the outcome is determined by the initial conditions and the toss. A skilled gambler can make a fair die land however they want.

The only thing I know is that a well-made die is symmetrical, and so if I have no prior knowledge of its initial orientation then I have to use a uniform prior because nothing else has the requisite symmetry group.

The same could be said for a die that is just sitting on the table without having been observed by me yet, no toss needed.

> A skilled gambler can make a fair die land however they want.

No, they can't. Dice control is a myth, and there isn't a single study that backs it up.

> The idea that priors are somehow ruining an "objective" model is just absurd to me.

I think some caution can be justified to a certain extent (not the blind "emotional" objections). When establishing priors in a low-data regime, one must necessarily be careful. The prior is a knob whose mass can substantially change the conclusions of the inference. That said, if we trust our belief about the region the available data do not inform us well of, why not utilize our domain knowledge/belief?
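A rough sketch of that sensitivity, assuming a Beta-Binomial model and three made-up priors: with a handful of observations the posterior mean moves a lot depending on the prior, while with plenty of data the priors barely matter. The prior labels, the counts, and the function names are illustrative assumptions.

```python
def posterior_mean(successes, trials, prior_a, prior_b):
    """Posterior mean of a rate under a Beta(prior_a, prior_b) prior."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


def prior_spread(successes, trials, priors):
    """Gap between the largest and smallest posterior mean across priors."""
    means = [posterior_mean(successes, trials, a, b)
             for a, b in priors.values()]
    return max(means) - min(means)


# Hypothetical priors, chosen only to show the effect.
PRIORS = {
    "flat Beta(1, 1)": (1, 1),
    "skeptical Beta(2, 18)": (2, 18),
    "optimistic Beta(18, 2)": (18, 2),
}

if __name__ == "__main__":
    for successes, trials in [(3, 5), (600, 1000)]:
        print(f"{successes}/{trials}:")
        for label, (a, b) in PRIORS.items():
            mean = posterior_mean(successes, trials, a, b)
            print(f"  {label}: {mean:.3f}")
        print(f"  spread: {prior_spread(successes, trials, PRIORS):.3f}")
```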