by lalaithion 845 days ago
If you see two people roll a d20 and get a 20, you get to say "wow, that was unlikely" to both of them, even if one of them privately admits they were going to quickly re-roll their die if they got below a 10. What matters is their actual behavior (identical in the example) not their intentions. The d6 vs d20 version is different because their behavior is different.

Let's imagine that we ran it as a simulation and we ran it a million times. The two people would have a different distribution of results. If you ignore the intention, you ignore reality as if that intention were not a part of it.

Do you not notice that your inference is less accurate using this line of reasoning? Does that not suggest that it's simply wrong?

What do you mean by 'results'?

They would not have different distributions of results on their first die roll.

They would have different distributions of results on their reported die roll.

If I am looking at their first die roll, the fact that they would have different reported die rolls doesn't matter!
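A minimal sketch of that distinction, assuming a hypothetical policy where the second player quietly re-rolls once whenever the first roll is below 10. The function name, trial count, and seed are illustrative, not anything from the thread:

```python
import random

def simulate(reroll_below, trials=200_000, seed=0):
    """Return (P(first roll == 20), P(reported roll == 20)) for one player.

    reroll_below is a hypothetical policy: if the first roll is lower than
    this value, the player re-rolls once and reports the second roll.
    Use 0 for an honest player who never re-rolls.
    """
    rng = random.Random(seed)
    first_20 = reported_20 = 0
    for _ in range(trials):
        first = rng.randint(1, 20)
        reported = rng.randint(1, 20) if first < reroll_below else first
        first_20 += first == 20
        reported_20 += reported == 20
    return first_20 / trials, reported_20 / trials

honest_first, honest_reported = simulate(reroll_below=0)
sneaky_first, sneaky_reported = simulate(reroll_below=10)
print(f"honest: first={honest_first:.4f} reported={honest_reported:.4f}")
print(f"sneaky: first={sneaky_first:.4f} reported={sneaky_reported:.4f}")
```

Under these assumptions both players should land near 1/20 on the first roll, while the re-roller's reported rate of 20s should be closer to 1/20 + (9/20)(1/20) = 0.0725.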

Here’s another example:

Say you have a lazy researcher. They flip a coin, and if it comes up heads, they do the experiment. If it comes up tails, they just write down a random number.

If you _only get access_ to the final number, then you should discount what they wrote down – it’s 50% likely to be fake.

If you do 1,000,000 simulations of this, it’s useless 50% of the time.

But if you know the result of the coin flip, it doesn’t matter whether they would have generated a nonsense number in a different timeline, or that they’re not reliably accurate. _You know_ they’re reliably accurate in _this case_, so you can trust their data.
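A rough sketch of the lazy researcher, assuming a made-up true value of 42, unit Gaussian measurement noise, and fabricated numbers drawn uniformly from 0 to 100. All of these are illustrative choices:

```python
import random

TRUE_VALUE = 42.0  # hypothetical quantity the experiment measures

def lazy_researcher(rng):
    """One run: heads -> real (noisy) measurement, tails -> made-up number."""
    heads = rng.random() < 0.5
    if heads:
        reported = TRUE_VALUE + rng.gauss(0, 1)  # real experiment, assumed noise
    else:
        reported = rng.uniform(0, 100)           # fabricated number
    return heads, reported

def mean_abs_error(values):
    return sum(abs(v - TRUE_VALUE) for v in values) / len(values)

rng = random.Random(1)
runs = [lazy_researcher(rng) for _ in range(100_000)]

all_reports = [r for _, r in runs]          # you only see the final number
heads_reports = [r for h, r in runs if h]   # you also saw the coin land heads

frac_fake = 1 - len(heads_reports) / len(runs)
print(f"fraction fabricated overall: {frac_fake:.3f}")
print(f"mean error, coin unseen:     {mean_abs_error(all_reports):.2f}")
print(f"mean error, coin seen heads: {mean_abs_error(heads_reports):.2f}")
```

Across all runs about half the reports are junk, but once you condition on having seen heads, the runs where they would have made something up contribute nothing to the error.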

This is well put. Coincidentally in the example the results are the same, but they need not be. Given repeated experiments with the same intentions, one may expect different distributions.

However, one could just move the argument up a level and manufacture a case of different intentions leading to the same distributions and then ask the same question.

Imagine you have a machine that rolls a d20 and lies if the die comes up 1-19, and tells the truth on a 20. Should you trust this machine usually? No. But if you can _see that the die comes up 20_ then you should trust it. The fact that it sometimes might lie doesn't mean that you should distrust the machine if you can see that in this case it's telling the truth.
> Coincidentally in the example the results are the same, but they need not be.

The question is whether we should draw different conclusions when the results are the same. I don’t think that anyone has any issues with drawing different conclusions when the results are different!

Unlikely in what probability space? We only see one version of reality so the probabilities that we assign to any outcome are based on a prior choice of probability space. That is why the researchers' intent matters.
Both events have the same probability of happening: 1/20. The fact that the researcher intended to do something in a reality that didn't happen isn't relevant.
If you want to know whether a drug is more effective than placebo, the answer to that question depends on both the data collected in a study and the initial study design. There’s a reason why it’s meaningless to say “that was unlikely” after somebody says they were born on January 1, or after getting a two-factor code that is the same number six times. There’s nothing special about those particular events except for the fact that we noticed them. Since we live in a single instance of the universe where they have already happened, they have probability 1. At the same time, in any given instance they have probability 1/365ish or 1/100000. The difference between these two interpretations of the probability is the same difference as having a good experimental design vs a flawed experimental design where you repeat the experiment until you get the results you want to see.
> a flawed experimental design where you repeat the experiment until you get the results you want to see.

But the Bayesian point is that, if you use Bayesian statistics, this doesn't work. Except by outright lying about their experimental protocol or the data that was actually collected (for example, only reporting the successful trial at the end and not all the failed ones that preceded it), an experimenter cannot "fool" you into accepting a hypothesis not justified by the data. They can point to the one successful trial all they want, and make up stories about how the previous failed trials were somehow different, and the Bayesian simply does not care. The Bayesian just looks at the entire corpus of data and finds that it doesn't support the hypothesis, and that's it.
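A toy illustration, assuming a Beta-Binomial model with a uniform Beta(1, 1) prior and a hypothetical record of nine failed trials followed by one success. The model and the numbers are assumptions chosen for the sketch, not anything specified in the thread:

```python
from fractions import Fraction

def posterior_mean(successes, failures, prior_a=1, prior_b=1):
    """Posterior mean of a success rate under a Beta(prior_a, prior_b) prior."""
    return Fraction(prior_a + successes,
                    prior_a + prior_b + successes + failures)

# Hypothetical record: the experimenter repeats the trial until it succeeds.
trials = [False] * 9 + [True]

# Updating on everything that was actually observed.
successes = sum(trials)
full = posterior_mean(successes, len(trials) - successes)

# Updating only on the one trial the experimenter wants to point at.
cherry_picked = posterior_mean(1, 0)

print(f"all 10 trials:   {float(full):.3f}")
print(f"last trial only: {float(cherry_picked):.3f}")
```

In this model the posterior depends only on the counts of successes and failures, so the experimenter's reason for stopping never enters the calculation. The only way to get the inflated second number is to leave the failed trials out of the data.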

Yes, indeed.