Hacker News
by johnnyg 2053 days ago
The book Thinking in Bets points out that predictions are given with a % of certainty. If I say X party will win at Y% and Z party wins, it doesn't mean the model is broken.
2 comments

On the other hand, this makes the whole operation unfalsifiable in any reasonable time frame. "I always said the candidate had a .1% chance of winning!"

Given two major elections in a row where the results were essentially outside the error bars, it is perfectly reasonable to be dropping your confidence in pollsters very hard. Once was perhaps understandable, but twice is getting into "right .01% of the time" territory.

I hadn't considered the "unfalsifiable" aspect of my comment. I think you make a good point.

At a poker table you get a high number of events to check against, but a presidential election happens only once every 4 years.

I wonder how accurate the local elections are vs the national? If neither is accurate, then that's a strong case for your point of view and a change in methodology.
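
To put rough numbers on the "few events vs. many events" point, here is a minimal sketch. It assumes every event is independent and the forecaster quoted the same probability for each one, which real elections almost certainly violate. The 5% figure and the hand counts are made up for illustration, and `prob_at_least` is just a helper name I picked.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(at least k of n independent events occur), each with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Two elections, both landing in a 5% tail (hypothetical numbers).
two_misses = prob_at_least(2, 2, 0.05)
print(f"2 misses in 2 elections: {two_misses:.4%}")

# 1000 poker hands, each called at 5%: you expect about 50 to come in.
# Seeing 80 or more would be strong evidence the 5% calls were off.
many_hands = prob_at_least(80, 1000, 0.05)
print(f"80+ hits in 1000 hands:  {many_hands:.4%}")
```

With only two observations you can compute a tail probability, but you can't tell a bad model from bad luck with much confidence. With a thousand observations the same kind of check becomes a real test.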

> On the other hand, this makes the whole operation unfalsifiable in any reasonable time frame. "I always said the candidate had a .1% chance of winning!"

I wonder how many people still think Nate Silver has a credible perspective: he's spun the above strategy into a job in unscientific punditry.

But elections aren’t single events; they are 50 different events. 5% of the time the results in Michigan will be outside the error bars. But if the results are outside the error bars in Michigan, Iowa, Wisconsin, Minnesota, etc., and the polls all got it wrong in the same direction for each (overestimated Biden’s support), that’s actually statistically quite unlikely.

Both the 538 and the Economist forecasting models treat errors as correlated: each state's vote share is not considered an independent event.

Wisconsin, Michigan, and Minnesota aren’t statistically independent though, so if a poll is wrong in one of them, it’s likely to be wrong in the others too.

Correlated errors are tough!
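
A quick Monte Carlo sketch of why this matters. This is a toy model, not how 538 or the Economist actually do it: each state's polling error is a shared component plus a state-specific one, both standard normal, mixed so that every state's error has unit variance and any two states have correlation `rho`. The four states, `rho=0.75`, and the 1.96 cutoff are all assumptions I chose for illustration.

```python
import math
import random

def prob_all_miss_same_direction(n_states, rho, z=1.96, trials=100_000, seed=0):
    """Estimate P(every state's polling error exceeds +z standard errors)."""
    rng = random.Random(seed)
    shared_weight = math.sqrt(rho)        # weight on the common (national) error
    local_weight = math.sqrt(1.0 - rho)   # weight on each state's own error
    hits = 0
    for _trial in range(trials):
        shared = rng.gauss(0.0, 1.0)
        # Each state's error has variance rho + (1 - rho) = 1.
        if all(
            shared_weight * shared + local_weight * rng.gauss(0.0, 1.0) > z
            for _state in range(n_states)
        ):
            hits += 1
    return hits / trials

independent = prob_all_miss_same_direction(n_states=4, rho=0.0)
correlated = prob_all_miss_same_direction(n_states=4, rho=0.75)
print(f"independent errors: {independent:.5f}")
print(f"correlated errors:  {correlated:.5f}")
```

Under independence, four same-direction misses should basically never show up in a run this size (the exact figure is 0.025^4, about 4 in 10 million). With a strong shared component the same outcome becomes far more common, so "all the Midwest polls missed the same way" is much weaker evidence against a model that assumed correlation than against one that didn't.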

Maybe it’s semantics, but to me that means something is wrong with the model. It’s not just a matter of an unlikely result happening some percentage of the time. There is some hidden factor that could have been adjusted for ahead of time.
Both, I think.

The model certainly ought to handle correlations, but the evaluation should take them into account too.

You can't treat the election as fifty independent replicates: a miss in both Minnesota and Wisconsin is clearly worse than one error, but it's also not as bad as (say) getting Wisconsin and Rhode Island wrong, where it's more likely that two separate errors occurred.

I understand the point mathematically; I think I'm talking more about how the presentation of the error term is unintuitive to me. It makes sense to me to say "the predictions were right, Trump just got lucky" when he outperforms the 95th percentile error bars in a state. He just got lucky. But if he outperforms it in a bunch of states because the polls didn't account for the fact that Trump supporters don't answer pollsters, then it doesn't make sense to me to say "the predictions were right because they factored in the chance that the polls had correlated errors." I get that you've quantified the possibility that the polls are wrong in a systematic way, but I don't think that's the kind of possibility people are thinking of when they hear "there is a 5% chance Trump could still win."