Hacker News
by mdoms 2131 days ago
Nate Silver got lucky one time and has been coasting on it since. The man has no credibility.
7 comments

Some of the other models for the last election were really bad. Giving something like a 95-98% chance to Hillary was arguably a fundamental failure. I found it very odd how they arrived at those numbers by e.g. treating every state as a separate chance while mostly ignoring that those results are not uncorrelated.
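To make the correlation point concrete, here is a rough Python sketch. Everything in it is made up for illustration (eleven hypothetical swing states, a 2-point lead in each, the error sizes, and the `win_probability` helper); it is not any forecaster's actual model. Both runs give each state the same total polling error, but the second run makes most of that error common to all states.

```python
import random

def win_probability(leads, state_sd, shared_sd, trials=20000, seed=0):
    """Estimate the chance that the poll leader carries a majority of states.

    leads     -- polling lead (in points) in each hypothetical state
    state_sd  -- std. dev. of the polling error unique to each state
    shared_sd -- std. dev. of a polling error shared by every state
    """
    rng = random.Random(seed)
    needed = len(leads) // 2 + 1
    wins = 0
    for _ in range(trials):
        shared = rng.gauss(0, shared_sd)  # the same miss hits every state
        carried = sum(
            1 for lead in leads
            if lead + shared + rng.gauss(0, state_sd) > 0
        )
        if carried >= needed:
            wins += 1
    return wins / trials

# Hypothetical inputs: eleven states, each with a 2-point lead.
leads = [2.0] * 11

# Same total error variance per state in both runs: 4.0**2 == 2.4**2 + 3.2**2.
independent = win_probability(leads, state_sd=4.0, shared_sd=0.0)
correlated = win_probability(leads, state_sd=2.4, shared_sd=3.2)

print(f"states treated as independent: {independent:.2f}")
print(f"states sharing a common error: {correlated:.2f}")
```

When the errors are treated as independent, the leader's misses tend to cancel out across states, so the simulated win probability comes out noticeably higher than when one shared miss can flip many states at once.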

I think he does a better job at emphasizing the uncertainty while still showing that polls can be pretty reliable.

In a scientific paper, the author would just write down what they do, and trust the reader to be experienced enough to discount such problems themselves. Journalists, however, write for a general audience, that is, for the lowest common denominator, so what they should write there becomes quite an interesting question.

So, 538 did actually change their error estimate during the 2016 campaign to better account for problems of correlation. From a purely mathematical standpoint that is kinda the wrong thing to do, but it is arguably better in line with what the readers expect an error estimate to be.

Why is it wrong?
> Giving something like a 95-98% chance to Hillary was arguably a fundamental failure.

Was it? What's the likelihood of someone who was polling as well as Clinton losing? 15% like The Upshot at the NY Times had? ~7% like with Sam Wang's model? Even 7% is around 1 in 14, not something shockingly improbable.

People often act like Silver's prediction was good because it gave Trump a higher probability of winning than many of the others, but that's not how probability works. If you say there's a 1/6 chance of rolling a die and getting a 2, and I say there's a 5/6 chance of rolling a 2, and we roll and get a 2, that doesn't mean that I'm correct. I don't think we've had enough rolls of elections resembling 2016 to really have a good grasp of where the percentages should be.

In general I question the value of assigning probabilities to election outcomes. This is especially true when you look at the probabilities a few months earlier - for instance, 538 had Clinton going from 49.9% on July 30, 2016 to 88.1% on October 18, 2016. Look at the probabilities they gave during the recent Democratic primaries, and they're also very bouncy. These probabilities lead people to believe that there's a much better understanding of the state of the race than what actually exists, to the point where I'd argue it edges up against pseudoscience.

Election forecasting is mostly about trying to quantify the current state of play based on imperfect signals. There is theoretically a "right answer" well before the final tally, but without being able to look inside people's heads en masse you can only guess at it. Still, this is conceptually different from forecasting the behavior of a system that's actually subject to randomness or [semi-]chaotic instability, where the given uncertainties will correspond at least partly with actual nondeterminism sitting between the current state of the system and the answer.

Therefore I think it's fair to say that the weight an election forecast assigns to the actual winner is a direct indicator of the accuracy of its model. We aren't trying to guess at how a set of dice are weighted, knowing they'll only be thrown once—we're trying to get as close as we can to knowing who is going to vote and who they are going to vote for, and (absent some large disaster or upheaval) a misforecast will be largely attributable to systematic errors in our methodology.

> Giving something like a 95-98% chance to Hillary was arguably a fundamental failure

Why was it a fundamental failure? A 5% chance is one in twenty, it happens.

I'm not saying that because of the percentage alone, but because I think the methodology there was suspect. And this was criticized before the election. The lead in the polls was not large, and making it look like a certain thing by treating the states as almost independent results just doesn't make any sense to me.
You think if Hillary went up against Trump, America would choose her 19 times out of 20? I’m not sure that makes the mental calculation any better than 95%.

95% is “Obama vs a dog”. Maybe in another country. Every 20 elections, Obama is bitten by the dog and doesn’t make it.

> 95% is “Obama vs a dog”

That's what it was, don't forget how impossible it seemed at the time, it was a giant upset and shock.

> You think if Hillary went up against Trump, America would choose her 19 times out of 20?

If you are using "Hillary" and "Trump" figuratively about future elections with similarly matched candidates, then yes, see above.

You're forgetting how bad Hillary was as a candidate.

I'm not saying she ran a bad campaign (though she did). I'm talking about her "negatives". Decades of scandals. (Yeah, Trump had them too, but at a minimum it meant that Hillary couldn't use Trump's scandals against him. Also, Hillary's scandals got a lot more national coverage when they happened than Trump's did.) Benghazi. The email server (and with it, the impression that she thought that rules were for other people). The impression that she thought that she was owed the presidency, rather than having to earn it. The way the DNC chose her over Sanders, overruling the will of many of the primary voters. And on and on.

It wasn't obvious at the time, because much of the press was pro-Hillary. But she was a terrible candidate. I think if the Democrats had run anyone else, they probably would have won against Trump.

Is anything different this time around other than the usual incumbent advantages?
It was an upset/shock. But it wasn't inconceivable, and it wasn't on the level of "In an unbelievable upset, the Libertarian Party has won."
> That's what it was, don't forget how impossible it seemed at the time, it was a giant upset and shock.

Only in your filter bubble.

Nope. Even Trump himself did not expect to win. That's part of why he was utterly unprepared for a transition, as has been widely documented.
I'd argue that 538's model likely overestimates uncertainty, because it is a benefit to them both ways.

Either they spin it that they were perfectly correct in those 95% of cases where they get it right, or they spin it that they were the least wrong (and therefore the most correct) in the other 5% of cases where everyone gets it wrong.

At no point did 538 make such a high forecast: their highest forecast was 88%.

There's a joke that you should always express 60% confidence in your predictions, since if the prediction pans out, you can claim to be right, but if it fails, you can bring attention to the "two times out of five wrong" part.

Yes - thanks for explaining this better.
538 has actually written about calibrating past forecasts. https://projects.fivethirtyeight.com/checking-our-work/
Their calibration is good in the context of this article, but there is a reason this article doesn't use their Presidential election model - there isn't enough data to do this calibration here.
Or maybe predicting the outcomes of elections is really hard.
So you're saying that 538 wanted an uncertain outcome and hacked the model design and parameters to deliver the outcome they wanted?
I'm not saying this was intentional, but there is a very high incentive for them to deliver this type of outcome.

There is not enough data to back-test their model on a single election which happens every 4 years, so the claim that a 60% prediction is fundamentally very different from a 95% prediction is statistically dubious.

It's not a benefit to them both ways, it's only useful to them if there's a substantial amount of those 5%s in which case they're probably right that everyone else is underestimating uncertainty.
In reality, I'm not sure I believe there's such a thing as a 95-98% probability of one major party winning in a country like the US. One can argue whether it's appropriate to fudge in additional uncertainty but there are a lot of things that could happen in the week before the election (including but not limited to candidates dying) that could throw existing poll results up in the air.

[ADDED: Or maybe something like that really is a few percent probability and you can end up with a 95% probability anyway. It just feels as if there's some upper limit to what you can measure using polls.]

It also discounts the penetration of board of elections in some 30+ states.
"he" does a better job...

who is "he" in your last sentence? ty

> Some of the other models for the last election were really bad. Giving something like a 95-98% chance to Hillary was arguably a fundamental failure.

The page is still up, you don’t have to pull that number from memory. At the end, it was also nowhere near 95-98%. https://projects.fivethirtyeight.com/2016-election-forecast/

By "other models" I think he means models other than 538's, several of which did put the probability of Clinton winning at over 95%.
Why would estimating a 95% chance for Hillary be a fundamental failure? 5% likelihood things happen often. You've never rolled a 20 on a 20-sided die?
Yep, the odds of any one person getting hit by lightning in a year are 1 in 700,000, yet every year several people are struck.
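The arithmetic behind that is easy to check. A minimal sketch, taking the 1 in 700,000 figure from the comment as given, assuming a round US population of 330 million (my number, not the commenter's), and treating people as independent:

```python
# The 1-in-700,000 annual odds come from the comment above; the population
# of 330 million is an assumed round number, and independence between
# people is a simplifying assumption.
ANNUAL_ODDS = 1 / 700_000
POPULATION = 330_000_000

# Expected number of people struck in a year.
expected_strikes = POPULATION * ANNUAL_ODDS

# Chance that not a single person is struck in a year.
p_nobody_struck = (1 - ANNUAL_ODDS) ** POPULATION

print(f"expected people struck per year: {expected_strikes:.0f}")
print(f"chance that nobody is struck: {p_nobody_struck:.3g}")
```

A tiny per-person probability multiplied by a huge number of people still yields hundreds of expected cases, which is the same reason a 5% outcome should not be treated as impossible.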
I don't agree, I think he speaks and writes with nuance and intelligence. What is your issue with his Bayesian statistical approach? He went to UofC and LSE so he clearly had top notch training.

His 2016 model and he himself were much more predictive of the Trump EC win; he repeatedly stated it was a possible outcome, something the vast majority of other forecasters completely missed.

Nate Silver was merely wrong, as compared to everyone else, who were spectacularly wrong. I'm not sure how that counts as a ringing endorsement.

These polls also never factor in things like "social acceptability of admitting that one voted for an unpopular candidate" or "groupthink among media organizations which aligns to their side of the aisle." The entire polling fiasco should be interpreted as the limitations of quantitative data, as opposed to qualitative data.

Our final forecast, issued early Tuesday evening, had Trump with a 29 percent chance of winning the Electoral College. By comparison, other models tracked by The New York Times put Trump’s odds at: 15 percent, 8 percent, 2 percent and less than 1 percent. And betting markets put Trump’s chances at just 18 percent at midnight on Tuesday, when Dixville Notch, New Hampshire, cast its votes.

https://fivethirtyeight.com/features/why-fivethirtyeight-gav...

"never factor in things..."

Good luck measuring those.

Consider this on the individual pollster level. Every four years you get a new batch of pundit-driven ideas about what's "really" driving the voters. Suppose you add a question that's meant to magically reveal voter preferences that are hidden by shame. What do the results of that question mean for the bottom line? Well, you're probably going to need multiple election cycles to find out. And by the time you do, the pundits have a new pile of bullshit for you to implement.

Now take it up a level. You've got a bunch of pollsters of varying predictive quality. Let them figure out what works best and include an estimate of their quality in how you process their results. I have no idea if pollster X's new question about cheese preferences is predictive, and honestly neither do they until a couple more cycles pass. All you can do is weight by past performance.

KISS wins in these situations.

> social acceptability of admitting that one voted for an unpopular candidate

Not sure why you are getting downvotes. This phenomenon has been academically researched and documented under the name “preference falsification” with many past examples. If anyone is interested the 1995 book “Private Truths, Public Lies” is on this research.

Pollsters know about preference falsification and try to model it.
No doubt they would try, but their prediction is bound to be at best as good as predicting the preference without asking the preference, which makes it profiling and not polling anymore. And as OP suggests, they failed at this spectacularly.
LSE as in London School of Economics?

IMO LSE is a negative signal of someone's mathematical / statistical skill. I once interviewed someone that was doing a PhD in Math at LSE, and couldn't program or solve simple math / probability questions.

It's quite obvious why - if you want to study math in London, you go to Imperial. LSE probably isn't even a second choice.

I believe they got every single state right in three consecutive elections before 2016? And even in 2016, their predictions align reasonably well with the outcome, easily beating all competitors.

..and that's only presidential elections. See https://projects.fivethirtyeight.com/checking-our-work/ for their quantitative reflections on accuracy. Looks pretty good.

For anyone with a history of probabilistic predictions, the goal is to be well-calibrated. For predictions that have 70% likelihood, you want to be right 70% of the time. 90% likelihood, you want to be right 90% of the time. There's a mathematical way to analyze an entire set of predictions and determine how well-calibrated that set of predictions is. Silver claims 538's predictions are well-calibrated.
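As a rough illustration of what such an analysis can look like, here is a small Python sketch. It bins forecasts by stated probability and compares each bin with how often the event actually happened, then computes a Brier score (mean squared error of the probabilities, which rewards calibration but also sharpness). The data and the helper names `calibration_table` and `brier_score` are invented for the example; this is not 538's own procedure.

```python
from collections import defaultdict

def calibration_table(forecasts, outcomes, n_bins=10):
    """Return (mean forecast, observed frequency, count) for each non-empty bin."""
    bins = defaultdict(list)
    for p, happened in zip(forecasts, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[index].append((p, happened))
    table = []
    for index in sorted(bins):
        pairs = bins[index]
        mean_forecast = sum(p for p, _ in pairs) / len(pairs)
        hit_rate = sum(h for _, h in pairs) / len(pairs)
        table.append((mean_forecast, hit_rate, len(pairs)))
    return table

def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast and outcome (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Made-up track record: ten 70% calls (7 came true), ten 90% calls (9 came true).
forecasts = [0.7] * 10 + [0.9] * 10
outcomes = [1] * 7 + [0] * 3 + [1] * 9 + [0] * 1

for mean_forecast, hit_rate, count in calibration_table(forecasts, outcomes):
    print(f"said {mean_forecast:.0%}, happened {hit_rate:.0%} of the time (n={count})")
print(f"Brier score: {brier_score(forecasts, outcomes):.3f}")
```

In this toy record the stated probabilities match the observed frequencies in both bins, which is what being well-calibrated means. With only a handful of forecasts per bin, as with presidential elections, the observed frequencies are too noisy to say much.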

There are also tools and websites out there that people can use to make predictions and track their own calibration over time. They're pretty fun in terms of encouraging one's own sense of rationality.

I was very unimpressed with his Democratic primary model, which he basically fine-tuned every week when new data came in, in order to get a result that was closer to his own preconceptions of the race.
> fine tuned every week when new data came in

Yeah, that’s what you’re supposed to do - use new evidence to update your beliefs.

I don't mean he allowed the model to update from new results. I mean the model would update, he'd be unhappy with its results, and so he'd change it until the outcome reflected his own biases. If you do that, you produce a bunch of mathematical rationalizations for probability distributions that you invented from scratch.
Ach, thanks for clarifying.
He’s been consistently more right (less wrong?) than the mainstream media. This hasn’t been a one-off thing.
> He’s been consistently more right (less wrong?) than the mainstream media.

Well, once he rose to popularity, 538 has been part of the mainstream media, affiliated with the NYTimes, then owned by ESPN, then transferred within the Disney Empire to ABC News. But, sure, it's pretty consistently been one of the better (not just election) statistical forecasting shops in the mainstream media.

That's the whole point, I think.

We're saying, "he's better at predicting things than the mainstream media!" The issue is that the mainstream media's bar is so low, there is no bar. And the reason this guy is somewhat respected is because his bar is only slightly higher than rat shit, which seems amazing compared to what we're used to.

The guy sucks at actually modelling outcomes. Because everyone does. He just sucks very slightly less than everyone else. Maybe the issue isn't that Nate Silver is slightly better than average, maybe the issue is that we shouldn't be sharing this garbage on the news, since it's not reflective of reality.

It's like saying, "sure, Cuba is a human rights nightmare, but it's okay to trade and interact with them, cause at least it ain't the Congo!"

Asking for people to not make or publish predictions is asking for the sun to not rise. You'll never see that day. Being less wrong is the best option that is actually on the table.
Which “one time” were you thinking of?