by mrtnmcc 2065 days ago

FiveThirtyEight has a page where you can choose winning states (i.e., condition on a certain outcome) and it will regenerate the prediction map: https://projects.fivethirtyeight.com/trump-biden-election-ma... This appears to be what Andrew Gelman is also trying to do with their raw data.

At the bottom of the 538 page it says, "If you choose enough unlikely outcomes, we’ll eventually wind up with so few simulations remaining that we can’t produce accurate results. When that happens, we go back to our full set of simulations and run a series of regressions to see how your scenario might look if it turned up more often."

I interpret that as running a regression (linear?) and extrapolating it out to the tail where the conditioning is happening. This should eliminate the issue Andrew is seeing?
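A rough sketch of that reading, not 538's actual code: the two-state toy model, the margins, the shared national swing, and the 100-simulation cutoff are all made up for illustration. Filtering on an unlikely win leaves almost no simulations, while a linear fit on the full set can still be evaluated at that point.

```python
import random

def simulate(n_sims, seed=0):
    """Draw toy (margin_a, margin_b) pairs that share a national swing."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        national = rng.gauss(0.0, 3.0)                # shared swing, in points
        a = -12.0 + national + rng.gauss(0.0, 3.0)    # state A: a long shot
        b = 2.0 + national + rng.gauss(0.0, 3.0)      # state B: a close state
        sims.append((a, b))
    return sims

def conditional_by_filtering(sims, min_kept=100):
    """P(win B | win A) from surviving simulations, or None if too few remain."""
    kept = [b for a, b in sims if a > 0]
    if len(kept) < min_kept:
        return None, len(kept)
    return sum(b > 0 for b in kept) / len(kept), len(kept)

def regression_estimate(sims, a_value):
    """Fit b = intercept + slope * a on ALL simulations, then predict at a_value."""
    n = len(sims)
    mean_a = sum(a for a, _ in sims) / n
    mean_b = sum(b for _, b in sims) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in sims) / n
    var = sum((a - mean_a) ** 2 for a, _ in sims) / n
    slope = cov / var
    intercept = mean_b - slope * mean_a
    return intercept + slope * a_value, slope

sims = simulate(2000)
prob, kept = conditional_by_filtering(sims)
predicted_b, slope = regression_estimate(sims, a_value=1.0)
print(f"simulations kept after conditioning: {kept}")
print(f"filtered estimate: {prob}")
print(f"regression slope: {slope:.2f}, predicted B margin at A=+1: {predicted_b:.1f}")
```

The regression only extrapolates whatever linear relationship the full set of simulations has, so if the underlying correlations are off, the extrapolation inherits that.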


paulgb 2065 days ago

From the plots Andrew posted, it looks like the problem is not just sample size: (some) individual state pairs are negatively correlated, e.g. https://statmodeling.stat.columbia.edu/wp-content/uploads/20...


mrtnmcc 2065 days ago

I'd argue that negative correlation in conditional distributions can be reasonable here.

In that particular WA-MS example, if Trump suddenly took more liberal positions and somehow won WA (e.g., announces he's pro-abortion), he would in fact be more at risk of losing Mississippi. The idea that these two states are in play at all is already fringe and would require some major ideological (or other third-variable) shift.


amscanne 2064 days ago

But then the correlations with the other 48 states are broken. In that insane scenario, Mississippi now votes for Biden (because I guess he’s suddenly come out as pro-life as well), but Alaska still goes for Trump.

The negative correlations don’t make sense. Maybe it’s a small problem and the model is solid overall, but... I don’t think you can justify that one effect.


mrtnmcc 2064 days ago

I think you have to look at the joint distribution with Alaska included to draw any conclusions. Just looking at separate marginals will be uninformative.
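To make that concrete, here is a toy sketch. It is a made-up three-state factor model, not the model under discussion: the baselines, the "ideology" factor, and its loadings are assumptions chosen only for illustration. It compares Alaska's marginal win probability with what you get after conditioning on WA alone and on WA and MS jointly.

```python
import random

def simulate(n_sims, seed=1):
    """Toy Trump margins (points, positive = Trump wins) for WA, MS, AK."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        national = rng.gauss(0.0, 3.0)   # swing shared by all states
        ideology = rng.gauss(0.0, 1.0)   # hypothetical shift that helps in WA, hurts in MS/AK
        wa = -10.0 + national + 6.0 * ideology + rng.gauss(0.0, 2.0)
        ms = 10.0 + national - 6.0 * ideology + rng.gauss(0.0, 2.0)
        ak = 8.0 + national - 3.0 * ideology + rng.gauss(0.0, 2.0)
        sims.append((wa, ms, ak))
    return sims

def share_winning_ak(sims):
    """Fraction of the given simulations in which the AK margin is positive."""
    return sum(ak > 0 for _, _, ak in sims) / len(sims)

sims = simulate(100_000)
given_wa = [s for s in sims if s[0] > 0]               # Trump wins WA
given_wa_and_ms = [s for s in given_wa if s[1] < 0]    # ...and also loses MS

p_marginal = share_winning_ak(sims)
p_given_wa = share_winning_ak(given_wa)
p_joint = share_winning_ak(given_wa_and_ms)

print(f"P(AK)                   = {p_marginal:.3f}")
print(f"P(AK | WA won)          = {p_given_wa:.3f}")
print(f"P(AK | WA won, MS lost) = {p_joint:.3f}  (n = {len(given_wa_and_ms)})")
```

In this toy setup the three numbers differ, which is the point: what happens to Alaska in the WA-won, MS-lost scenario depends on the joint structure, and the marginal for Alaska alone does not show it.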
