by scarmig 4983 days ago
Here's a slightly OT question, for politically inclined folks:

How likely would you be to look at a site that on election day scrapes election results as they come in from various swing states, and extrapolates the state result by projecting each county individually from already reported precincts?

Obviously there are some assumptions about precinct homogeneity and turnout implicit in that, but it'd provide at least some value over the raw results ("OMG with 5% of the vote in, Obama's leading by 20 points in Ohio!").
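To make the idea concrete, here's a rough sketch of the projection, not a real implementation. It assumes every precinct in a county is about the same size and votes about the same way (the homogeneity assumption above), so each county's reported votes are scaled up by total precincts over reported precincts. The `County` fields and function names are made up for illustration, and any numbers you feed it would come from whatever scraper you manage to get working.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class County:
    """Hypothetical record for one county's partial results."""
    name: str
    precincts_total: int
    precincts_reported: int
    votes_a: int  # votes counted so far for candidate A
    votes_b: int  # votes counted so far for candidate B


def project_county(c: County) -> Optional[Tuple[float, float]]:
    """Scale reported votes up to the full county.

    Assumes unreported precincts look like reported ones in both
    size and vote split. Returns None if nothing has reported yet.
    """
    if c.precincts_reported == 0:
        return None
    scale = c.precincts_total / c.precincts_reported
    return c.votes_a * scale, c.votes_b * scale


def project_state(counties: List[County]) -> Tuple[float, float, List[str]]:
    """Sum the per-county projections.

    Counties with no reported precincts are left out of the totals
    and returned by name, so the caller knows what is missing.
    """
    total_a = 0.0
    total_b = 0.0
    skipped: List[str] = []
    for c in counties:
        projected = project_county(c)
        if projected is None:
            skipped.append(c.name)
            continue
        total_a += projected[0]
        total_b += projected[1]
    return total_a, total_b, skipped
```

With invented numbers: if a big county has 10 of 100 precincts in at 6,000 to 4,000 for A, and a small county has 10 of 20 in at 2,000 to 5,000 for B, the raw count shows B ahead 9,000 to 8,000, while the projection has A ahead 64,000 to 50,000. That gap is the whole point of projecting county by county instead of reading the raw totals.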

4 comments

I'm almost enough of a political nerd and developer nerd to try this, but I think the hardest part would be the scraping... Since the news sites have probably changed their formatting since the last election, it would be hard to test the scraper until they start displaying results. If it takes an hour to get all the bugs worked out in the scraper, well, by then the scraping might be useless. Maybe there's some kind of centralized API for accessing raw election results, though.
I'll bet you could get each state's official SOS results relatively easily. However, "calling" the election depends on more than just the official SOS reports. CNN will call things when their exit polls show one thing and then the early rounds of official results confirm them.
I think you'd need more granularity than SOS websites provide. They tend to be county by county, and you really need precinct by precinct.
I think county by county results would be enough to do a reasonable prediction but I don't think SOS websites are updated with those results in anything close to real time on election night.
I considered building such a site a couple of months ago; the problem is getting the streaming results. From my research, there is no public API that gives you access to results in real time at the level of granularity you would need (at least county level). I suppose you could scrape the data from the networks' sites, but you don't know how that's going to be formatted until election day, at which point it's too late.
CNN and a few others have been doing this in recent cycles, though maybe their presentation could be improved upon. They do seem to emphasize the raw numbers on-air though.

On this topic, a mirror of the Sec.-of-State result pages might be pretty useful. A lot of them aren't going to be able to handle the load.

I think you basically described FiveThirtyEight
FiveThirtyEight definitely doesn't do this, or hasn't in the past, at least. It models via demographics and polling, while this would be a much simpler model basing its predictions on already reported results.