Hacker News
by screye 2068 days ago
I disagree with the author's claim that the tails are too fat for isolated anomalies. There are most certainly events that could lead to a red California or a blue Alabama.

Presidential assassination, war, video proof of something incredibly heinous (pedophilia?), etc. can absolutely lead to these outcomes. You don't even have to go that far back. Nixon and Reagan flipped states like nobody's business.

I do, however, agree that 538's state-to-state correlation model seems weak.

California and Alabama would only flip during a wave, and that wave would consume any and all states. The fact that 538's model doesn't strongly show that pattern is a failing of it. But a model that inaccurately models the unlikeliest of events (California flipping while Florida stays blue) is not necessarily a terrible predictor of its primary target (Presidential likelihoods).

As a data scientist, I can totally understand Nate's hesitation. Do you impose strong priors on the model to reflect strong domain intuition, or do you build a model that best characterizes the data it is based on? In the presence of infinite data, you should abandon all domain-based priors. For single-digit numbers of data points, priors are essential. For any amount of data in between, it is anyone's best guess.
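To make the prior-versus-data trade-off concrete, here is a minimal sketch. It assumes a Beta-Binomial model, which is only an illustrative choice and not anything 538 is known to use. The prior parameters and the observed rate are hypothetical numbers.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a rate under a Beta(prior_a, prior_b) prior
    after observing `successes` out of `trials` binomial outcomes."""
    return (prior_a + successes) / (prior_a + prior_b + trials)

# Hypothetical strong prior centred on 0.2 (Beta(20, 80)),
# while the data consistently show a rate of 0.6.
for n in (5, 500, 500_000):
    k = n * 3 // 5  # 60% of the observations are successes
    print(n, round(posterior_mean(20, 80, k, n), 3))
```

With 5 observations the estimate stays close to the prior; with hundreds of thousands it is essentially the observed rate. The awkward middle ground is where the choice of prior actually decides the answer.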

2 comments

I've always liked Enrico Fermi's attitude on this. When you're Enrico Fermi, you get to say things like "One data point gives you a curve. Two data points gives you the distribution about the curve."
Curious, is there a source for this? Is it meant as satire?
The source is a long ago personal comment from my dad, who worked with and was personal friends with Fermi.

As to whether it's sarcasm, I'd describe it more as an inside joke with just a touch of the self-aware intellectual arrogance Physicists are famous for (see Lord Rutherford's "All science is either Physics or stamp collecting").

Trying to explain a joke is always dangerous so hopefully what follows won't be a mistake, but here goes:

The way I have always interpreted the first sentence is to hear Fermi saying that when you are doing an experiment, you should have a deep understanding of the family of curves or behaviors the system under test is expected to follow, including the full range of curves that would follow from interactions that are wildly different from what you might naively expect. If you really understand the Physics of the system (this is where it starts to blur the line from advice to a joke), the measurement of one single data point should be enough to tell you which of those possible curves describes the actual behavior of the system (the joke being both that it's funny to be that arrogant and that it's obvious to anyone you'd tell this joke to that mathematically you need at least two points to determine a line, so by saying one point gives you not just a line but a curve the speaker is purposely going over the top for fun).

Moving into the second sentence ("two points gives you the distribution about the curve") takes the statement into full-on joke mode, with the comment shifting from an observation about knowing your Physics to an insider dig at the relationship between theoretical Physicists and experimental Physicists (Fermi being one of the greatest theoretical Physicists of all time). When he says that two points gives you a distribution about the curve, he's saying he as a theoretical Physicist understands the underlying Physics of the system better than the experimentalist understands the noise in their experimental hardware, or alternatively that the noise in the hardware is sufficiently uninteresting as to be irrelevant to him. The former view would simply be arrogance, but leaving the second option open circles the joke back to include a bit of insider self-deprecating humor in that he's purposely ignoring experimental error, a thing theoretical Physicists are famous for doing.

Awesome that there's the connection with your dad. I got the joke, but it's also funny that some experimentalists have taken this literally, e.g. hand-sketching the gravitational wave signature and looking for matches, as done with LIGO.

It's bizarre there was no better analytical/computational way to come to what they were expecting.

OK, so is this your point?

Something could flip California and Alabama (for example, Trump starts defending Roe v. Wade and in response Biden somehow manages to sound like he's opposing it). This would probably be some latent hidden variable, like whether the candidates are seen as socially conservative, which would affect all states (though California and Alabama would be the most impacted).
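A rough simulation of that latent-variable idea, as a sketch only. The margins, spreads, and the function name `flip_counts` are all hypothetical, and this is not how 538's model is built. It compares a swing shared by both states with a swing drawn independently for each state.

```python
import random

def flip_counts(shared_swing, n_sims=20000, seed=0):
    """Count simulations where CA flips red, and how many of those
    still have FL blue. All numbers below are made up for illustration."""
    rng = random.Random(seed)
    ca_margin, fl_margin = 0.25, 0.02   # hypothetical Democratic margins
    swing_sd, noise_sd = 0.12, 0.03     # hypothetical spreads
    ca_flips = ca_flips_fl_blue = 0
    for _ in range(n_sims):
        ca_swing = rng.gauss(0, swing_sd)
        # Shared latent variable: FL gets the same swing as CA.
        # Independent case: FL draws its own swing.
        fl_swing = ca_swing if shared_swing else rng.gauss(0, swing_sd)
        ca = ca_margin + ca_swing + rng.gauss(0, noise_sd)
        fl = fl_margin + fl_swing + rng.gauss(0, noise_sd)
        if ca < 0:
            ca_flips += 1
            if fl > 0:
                ca_flips_fl_blue += 1
    return ca_flips, ca_flips_fl_blue

for shared in (True, False):
    flips, odd = flip_counts(shared)
    print(f"shared={shared}: CA flips {flips}, of which FL stays blue {odd}")
```

Under the shared swing, a red California with a blue Florida should almost never show up; under independent swings it shows up in a large share of the California flips, which is the pattern being criticized.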