by enriquto 907 days ago
> When sample size grows, frequentist and bayesian [...] estimates seem to converge to each other anyway

Yes. And so? Bayesians would argue (and I quote) that "the interesting limit in statistics is when the number of samples tends to one. The limit when the number of samples tends to infinity is completely useless."
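
To make the two limits concrete, here is a minimal sketch. It assumes a coin-flip style (binomial) model with a Beta(1, 1) prior, neither of which comes from the thread, and the function names and the "about 30% successes" data are made up for illustration. It compares the frequentist maximum-likelihood estimate with the Bayesian posterior mean as the sample size grows.

```python
def mle(successes, n):
    """Frequentist maximum-likelihood estimate of the success probability."""
    return successes / n


def posterior_mean(successes, n, a=1.0, b=1.0):
    """Posterior mean under an assumed Beta(a, b) prior.

    A Beta(a, b) prior combined with a binomial likelihood gives a
    Beta(a + successes, b + n - successes) posterior, whose mean is
    (a + successes) / (a + b + n).
    """
    return (a + successes) / (a + b + n)


if __name__ == "__main__":
    for n in (1, 10, 100, 10_000):
        s = round(0.3 * n)  # hypothetical data: roughly 30% successes
        print(n, mle(s, n), posterior_mean(s, n))
```

With one sample the two numbers differ a lot, because the prior carries most of the weight; with ten thousand samples they are nearly identical.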

> I tried getting into Bayesian stats but honestly it just seems overkill for most cases.

There are 3 black balls and 7 white balls in an opaque bag. How likely is it to pick a black ball? Bayesian statistics gives a straightforward answer (you just assume an uninformative prior and perform a computation). But frequentist statistics starts to argue about an infinite number of replicas of your own universe and other nonsensical constructions. Not sure that the Bayesian approach is overkill in that case...


> Yes. And so? Bayesians would argue (and I quote) that "the interesting limit in statistics is when the number of samples tends to one. The limit when the number of samples tends to infinity is completely useless."

The "and so?" is answered right after that. The prior dominates, which is a bad thing.

As the amount of data tends to 0 (idk why the quote is using 1), of course your belief tends to whatever your belief was before you saw any data. What else could it possibly tend to? Of course it's very sad that we don't have any data, but that's no fault of Bayesian inference.
> As the amount of data tends to 0 (idk why the quote is using 1)

The smallest number of samples you can use is 1, isn't it? If you have 0 samples then you do nothing because you have no data. Is there a way to have half a sample?

> of course your belief tends to whatever your belief was before you saw any data

Your beliefs should tend to that, sure, but if you're trying to produce an actual number for sharing then your beliefs shouldn't be a huge factor, and an uninformative prior being a huge factor is also bad.

For numbers that leave my head/notebook, I'd rather keep the new evidence by itself and say it's weak.

Does Bayesian statistics have a concept for absence of belief? I don't feel like believing that everything is equally likely is equivalent to absence of belief. But maybe it is?
There is a concept of minimum knowledge (maximum entropy). There is a concept of invariance (like translation invariance where you have no reason to prefer one position to another because the origin could be anywhere - or scale invariance where the value of a magnitude could be high or low if you don't know anything about the unit of measurement).

I'm not sure if by "absence of belief" you mean "ignorance" or something else.

I think about something like known ignorance. I know that I don't know anything about this, thus I refuse to have any belief about what it might be, as I know any belief would be unwarranted.
You need at least something to be ignorant about, but for a given "this" you can specify what you do know and calculate a probability distribution representing just that knowledge, avoiding any unwarranted belief.

If you have a die and you don't know anything else about it you should assume that the probability for each side is 1/6.

If you also know that the expected value is 4 (instead of 3.5 for a fair die) there is a way to calculate the probability distribution that reflects that constraint - and nothing else.
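
A rough sketch of that calculation, in case it helps. It relies on the standard result that the maximum-entropy distribution under a mean constraint has the exponential form p_k proportional to exp(lam * k), and finds lam by bisection. The function name `maxent_die`, the search interval, and the tolerance are my own choices for illustration.

```python
import math

FACES = range(1, 7)


def maxent_die(target_mean, tol=1e-12):
    """Maximum-entropy distribution over die faces 1..6 with a given mean.

    Uses the exponential-family form p_k ∝ exp(lam * k) and solves for
    lam by bisection. The mean is strictly increasing in lam, so
    bisection converges for any target strictly between 1 and 6.
    """
    def dist(lam):
        weights = [math.exp(lam * k) for k in FACES]
        total = sum(weights)
        return [w / total for w in weights]

    def mean(lam):
        return sum(k * p for k, p in zip(FACES, dist(lam)))

    lo, hi = -50.0, 50.0  # assumed search interval for lam
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return dist((lo + hi) / 2)


if __name__ == "__main__":
    print(maxent_die(3.5))  # no extra information: close to uniform
    print(maxent_die(4.0))  # tilted towards the higher faces
```

With a target mean of 3.5 this recovers the uniform 1/6 answer, and with a target of 4 the probabilities rise steadily from face 1 to face 6.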

Now, if you don't even want to think about anything Bayesians can do that too.

Why do you think it's a bad thing for your beliefs to remain the same in the absence of new data?
Should you have any beliefs in the absence of data? And if you have some prior data but no additional data now, why carve out the past as a separate thing and call it a prior? Why not just call everything you have data?
That's precisely how Bayesian inference works! But rather than having to repeat all analysis of prior data sets, we summarize that analysis in the form of a posterior, which becomes the prior for the next analysis.
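
A small sketch of that point, assuming a Beta prior with binomial data (my choice of model, not something stated above; the `update` helper and the batch counts are made up). Updating on two batches one after the other gives the same posterior as treating everything as one data set.

```python
def update(prior, successes, failures):
    """Conjugate Beta-binomial update.

    prior is a tuple (a, b) for a Beta(a, b) distribution; the returned
    tuple holds the posterior's parameters.
    """
    a, b = prior
    return (a + successes, b + failures)


prior = (1, 1)    # assumed uniform Beta(1, 1) starting point
batch1 = (3, 7)   # hypothetical earlier data: 3 successes, 7 failures
batch2 = (5, 5)   # hypothetical new data: 5 successes, 5 failures

# Yesterday's posterior serves as today's prior.
step_by_step = update(update(prior, *batch1), *batch2)

# Pool everything and call it all "data".
all_at_once = update(prior, batch1[0] + batch2[0], batch1[1] + batch2[1])

print(step_by_step, all_at_once)
```

Both routes end at the same Beta parameters, so in this model the split between "prior" and "data" is only bookkeeping.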
I didn't say anything about what should happen to your beliefs.
You said that the prior dominating is a bad thing. The prior is your beliefs about a parameter prior to observing data (as I suspect you know!). Maybe I'm not getting what you're saying.
I was assuming these are general purpose statistics, which means you might want to share them with someone. It's bad for those to get tainted by your personal priors. If it's a purely personal calculation then sure that's fine.
That's fine and I would agree - you could share the summary statistic used, or the likelihood ratio between the null and some alternative models.

But you shouldn't share a frequentist parameter estimate or confidence interval if you have prior information that would influence it non-negligibly, at least not without sharing that prior information also.

That's an interesting observation.

Let's say you have a personal belief that something is going to happen with probability x. Would you actually want to tell others that the probability is y, because that's what the data says, without letting people know that for other reasons that are not reflected in the data, you truly believe it is x?