by aakilfernandes 2062 days ago
You're comparing two scenarios, one in which you know all the facts, and one in which you don't.

In the dice toss scenario, we know everything relevant. In the election scenario, we don't.

A model like this is attempting to say "these are the rules we think exist. Based on the rules, and assuming the data is off by an error drawn from some random distribution, here's what we think could happen".
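To make "rules plus a random error" concrete, here is a minimal sketch, not any forecaster's actual model. The only rule is that the poll leader wins if the real margin is above zero; the poll margin, the normally distributed error, and its standard deviation are all assumptions picked for illustration.

```python
import random

def win_probability(poll_margin, error_sd, n_sims=100_000, seed=0):
    """Share of simulated elections won by the poll leader.

    poll_margin: the leader's polled lead in points (hypothetical input).
    error_sd: assumed standard deviation of the poll-vs-result error.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        # Assumed rule: actual margin = polled margin + random polling error.
        actual_margin = poll_margin + rng.gauss(0.0, error_sd)
        if actual_margin > 0:
            wins += 1
    return wins / n_sims

# Same polls, two different assumptions about how wrong polls can be.
narrow = win_probability(poll_margin=3.0, error_sd=3.0)
wide = win_probability(poll_margin=3.0, error_sd=6.0)
print(f"error sd 3 pts: {narrow:.2f}, error sd 6 pts: {wide:.2f}")
```

The polls are identical in both calls; only the assumed error size changes, and the headline probability moves with it. That is the kind of assumption forecasters can reasonably disagree about.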

What different forecasters disagree about is what the rules are: for example, the relevance of certain demographic characteristics, and how far polling (conducted prior to the election) can diverge from actual election results.

There are a huge number of assumptions, and forecasters disagree on them. We have very little historical data (polling is very recent), and even with complete historical data, future elections do not always conform to past elections.


I will veer this off into the dreaded political territory even though this is mostly a technical discussion.

The Democratic Party proved it was not as progressive as they thought as Sanders lost the primary. The reality is that the country as a whole is not as liberal either, regardless of what these pollsters are asking people. You think the party is youthful and ready for progressive ideas, but alas, the party wholly rejected an amazingly progressive candidate in Sanders. You think everyone’s super pissed about the Coronavirus handling, police brutality, and healthcare, but alas, you find out people associate BLM protests with crime, the virus with China, and socialism with unfair wealth redistribution. We can keep learning this the hard way I guess, this is America after all.

It’s important that the technical discussions are happening this time around, because there were virtually none last time. The post mortems for these forecasts being wrong again should be a death knell for accumulating bad data. I’m certain the models are good, but I’m not certain the data is.

Anyway, if you want my hot take, the conditional forecasting is to save their ass on election night from being embarrassingly wrong again. Imagine writing a giant if-statement that looked something like ‘and if(imWrong) changeMyAnswer’.

> Anyway, if you want my hot take, the conditional forecasting is to save their ass on election night from being embarrassingly wrong again.

Well, Nate Silver wrote a full, critically acclaimed book about why these types of forecasts are more useful (and accurate) in reality: they account for uncertainty. He has been doing this for years, ever since he used to write similar algorithms to help bookies pick odds for sporting events, so I think your hot take isn’t based in any world of facts or knowledge on this.

Don’t trust a forecaster that says with certainty that a certain candidate will win, unless they have also bet their life’s earnings on it. Showing your statistical confidence level isn’t a bad thing.

I think it’s certainly more grounded in reality if you realize 538 is basically finished if they miss the mark again.

If you listen to what they say, they admit they were not able to measure for the no-college male demographic in 2016, or in other words, they couldn’t model identity politics. Why couldn’t they do that? I’m not sure, but they are certain they can this time around because they saw the 2016 data and now believe they have more complete data to not make the same mistake again.

They are looking at elections as if there are hundreds of millions of elections that happen every day and the data speaks for itself. No, sorry, there are very few elections to extrapolate from the way they are doing it, and you really need to do sociopolitical analysis of things like a demographic identity bloc (no-college whites that feel some way about things) to get at the undercurrents that can sway an election.

Lastly, it doesn’t take a genius to sit there at 10pm on election night and go ‘well if Florida and Michigan went this way, then probably so will these other states in flux’. ‘Our forecast becomes more accurate as we get the actual poll closing numbers on election night’, ah I see, you’re all geniuses, I should have known.
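For reference, the election-night update being described is just conditioning on early results. A rough sketch, assuming two hypothetical toss-up states whose polling errors share a common component (the state names and error sizes are made up, and this is not 538's actual method):

```python
import random

def simulate(n_sims=50_000, seed=2):
    """Simulate win/loss for candidate X in two hypothetical states."""
    rng = random.Random(seed)
    # Hypothetical polled margins (points): both states are toss-ups.
    polls = {"early_state": 0.0, "late_state": 0.0}
    shared_sd, state_sd = 3.0, 2.0  # assumed error sizes
    results = []
    for _ in range(n_sims):
        shared = rng.gauss(0.0, shared_sd)  # polling error common to every state
        results.append({
            state: margin + shared + rng.gauss(0.0, state_sd) > 0
            for state, margin in polls.items()
        })
    return results

sims = simulate()
# Before any results: how often does X win the late state?
unconditional = sum(r["late_state"] for r in sims) / len(sims)
# After the early state is called for X: keep only the matching simulations.
called = [r for r in sims if r["early_state"]]
conditional = sum(r["late_state"] for r in called) / len(called)
print(f"before: {unconditional:.2f}, after early state called: {conditional:.2f}")
```

How far the number moves depends entirely on the assumed shared error. With no shared component, the early result would say nothing about the late state.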

Anyways, we’ll know soon enough.

> If you listen to what they say, they admit they were not able to measure for the no-college male demographic in 2016, or in other words, they couldn’t model identity politics. Why couldn’t they do that? I’m not sure,

You seem to have a fundamental misunderstanding of what FiveThirtyEight is trying to model, versus what pollsters are trying to model with the numbers they publish that FiveThirtyEight consumes. The kind of demographic weighting you're complaining about FiveThirtyEight being bad at is something the pollsters do, and is outside the scope of FiveThirtyEight's forecasting models.

> If you listen to what they say, they admit they were not able to measure for the no-college male demographic in 2016, or in other words, they couldn’t model identity politics. Why couldn’t they do that? I’m not sure, but they are certain they can this time around because they saw the 2016 data and now believe they have more complete data to not make the same mistake again.

I think you possibly misunderstand what 538 _do_ a bit. Their data is based on polling, so they can only work on what the pollsters do. Historically, pollsters didn't pay that much attention to education, beyond using income or class as a proxy for it; one middle-class white man was pretty much like another. This worked quite well historically, but no longer does (and it's not just a US phenomenon; it was also a contributor to polling problems for Brexit, notably).

In their current model, 538 assume a higher rate of uncertainty than last time round; also, some pollsters now model education. But really there's not that much they can do about stuff that pollsters don't ask about.

No, I don’t think so. If you build a model out of pollsters asking stupid questions, you deserve some blame.

I’ve got some basketball statistics to populate 538's model if they're interested. LeBron did pretty good this season, hopefully they can correlate that with the black vote.

Their model is not transparent on any level, because if they make it transparent, we’d easily be able to see why it’s ridiculous.

> I think it’s certainly more grounded in reality if you realize 538 is basically finished if they miss the mark again.

What does missing the mark mean, though? In 2016 they gave Donald roughly a 30% chance of winning and Hillary roughly a 70% chance. Does that mean they were wrong? Not really, because that's how probabilistic forecasting works. They stated their uncertainty up front: they thought Hillary was a 70% favorite, but that Donald would still win about 3 times in 10.
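One way to see why a single upset doesn't settle it is to score forecasts over many events rather than one. A minimal sketch using the Brier score (mean squared error between the stated probability and the 0/1 outcome, lower is better). The simulated world where favorites truly win 70% of the time is an assumption for illustration, not real election data.

```python
import random

def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

rng = random.Random(1)
n_events = 10_000
# Hypothetical world: each favorite truly wins 70% of the time.
outcomes = [1 if rng.random() < 0.7 else 0 for _ in range(n_events)]

hedged = brier_score([0.7] * n_events, outcomes)   # forecaster who says "70%"
certain = brier_score([1.0] * n_events, outcomes)  # forecaster who says "100%"
upset_rate = 1 - sum(outcomes) / n_events

print(f"upsets: {upset_rate:.0%}, hedged: {hedged:.3f}, certain: {certain:.3f}")
```

Both forecasters pick the same favorite every time and are "wrong" in the same roughly 30% of events, but the one who admitted the 30% scores better overall. Of course, one election is a single draw, so it can't tell the two apart on its own.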

> The Democratic Party proved it was not as progressive as they thought as Sanders lost the primary.

The FiveThirtyEight forecast for the Democratic primary [1] gave Biden the highest chance of winning for most of the process. He did have a steep drop in the month before Super Tuesday (followed by an equally steep rebound), but still, I wouldn't say the forecast was especially bad. That said, polling is always worse for primaries than general elections, since there are more candidates and fewer voters.

[1] https://projects.fivethirtyeight.com/2020-primary-forecast/