Hacker News
by Mister_Snuggles 1955 days ago
This is interesting, but why isn't this data easily available by default? I understand that the Presidential Election is actually a separate election in each state, but why wouldn't each state's election authority make that data readily available by default? This really feels like data that should just be out there for anyone to download and analyze.

In Canada, the equivalent data is readily available from Elections Canada[0]. For Provincial elections, the story is a bit more mixed, but open by default is the general rule. Elections Alberta, for example, provides Excel files with poll-by-poll results[1] - it's not as easy to work with as Elections Canada's CSVs, but a little Python can get it into a more reasonable format.

[0] https://www.elections.ca/content.aspx?section=res&dir=rep/of...

[1] https://officialresults.elections.ab.ca/orResultsPGE.cfm?Eve...


The data is easily available in most places. The issue is that every place uses a different reporting format, including using slightly different names for the same candidates. The value of the bounty is getting this data into a single schema recording all the votes for every precinct in the country, with all names normalized.
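To make the name normalization concrete, here is a minimal sketch in Python. The alias table and the function names (`clean`, `normalize_candidate`) are hypothetical and are not taken from the bounty or from any real results file. A real table would have to be built by hand from the spellings each reporting entity actually uses.

```python
import re
import unicodedata

# Hypothetical alias table: maps cleaned-up spellings to one canonical name.
# The entries are illustrative, not taken from any real results file.
ALIASES = {
    "joseph r biden": "Joe Biden",
    "joseph r biden jr": "Joe Biden",
    "biden joseph r": "Joe Biden",
    "joe biden": "Joe Biden",
}


def clean(raw: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return " ".join(text.split())


def normalize_candidate(raw: str) -> str:
    """Return the canonical name, or the stripped raw string if no alias is known."""
    return ALIASES.get(clean(raw), raw.strip())
```

Unknown spellings are passed through unchanged rather than guessed at, so they can be reviewed and added to the table by hand.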
What level are the differences at? I can see each state having a different reporting format. Do the differences go down to the electoral district or county or something?

In Canada it's nice and easy - federal elections are handled by Elections Canada, provincial elections are handled by each province's election authority, and we don't have cases where an election at one level results in a person at a higher level getting into office. Well, except for Alberta's senate elections which are somewhat farcical anyway (senators are appointed by the Prime Minister, so the results of this particular election are basically vague suggestions that the PM sometimes follows).

New York is a great example of weirdness.

New York allows a candidate to be a party's nominee even if the candidate isn't a member of that party. So you had Joe Biden as the nominee for the Democratic Party and then Joe Biden also listed as the nominee for the Working Families party.

Just weirdness like that abounds in the data in almost every state.
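As a rough illustration of what that means for the data, a candidate who appears on several party lines needs those rows summed to get one total per precinct. The sketch below assumes a hypothetical row layout of (precinct, candidate, party line, votes). The vote counts and the second candidate are made up.

```python
from collections import defaultdict

# Made-up rows for one precinct: (precinct, candidate, party_line, votes).
rows = [
    ("P-001", "Joe Biden", "Democratic", 410),
    ("P-001", "Joe Biden", "Working Families", 35),
    ("P-001", "Other Candidate", "Party X", 300),
]


def totals_by_candidate(rows):
    """Sum votes across party lines for each (precinct, candidate) pair."""
    totals = defaultdict(int)
    for precinct, candidate, _party_line, votes in rows:
        totals[(precinct, candidate)] += votes
    return dict(totals)
```

Keeping the party line as its own column, and summing only when needed, preserves the per-line counts for anyone who wants them.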

In addition, a lot of the precinct reporting was at the county level, so states wouldn't have a CSV that contained all precinct-level voting data, and you have to go to each county to get it. Some states have a lot of counties. PA, for example, has 67, and each county publishes data in a different format with different values.

It's tedious and honestly impossible to automate (at least in the case of PA).
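Even when the parsing has to be written by hand, the pieces can share a shape: one small adapter per county, each emitting the same record layout. This is only a sketch under assumptions. The two file formats, the county keys "A" and "B", and every function name here are hypothetical and do not describe any real county's files.

```python
import csv
import io

# Assumed common schema for every record: (county, precinct, candidate, votes).


def parse_county_a(text):
    """Hypothetical format: one row per precinct and candidate."""
    for row in csv.DictReader(io.StringIO(text)):
        yield ("A", row["Precinct"], row["Candidate"], int(row["Votes"]))


def parse_county_b(text):
    """Hypothetical format: one row per precinct, one column per candidate."""
    for row in csv.DictReader(io.StringIO(text)):
        precinct = row.pop("PCT")
        for candidate, votes in row.items():
            yield ("B", precinct, candidate, int(votes))


# One hand-written adapter per county.
ADAPTERS = {"A": parse_county_a, "B": parse_county_b}


def load_all(files):
    """files maps a county key to the raw text of its results file."""
    records = []
    for county, text in files.items():
        records.extend(ADAPTERS[county](text))
    return records
```

This doesn't remove the tedium of writing 67 adapters. It only keeps the county-specific quirks isolated from the code that works on the combined data.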

That's fascinating! Thank you for the explanation!

Our elections in Canada are a lot simpler - it's interesting to see how our neighbour does things!

Canada. Canada! Canada! — another Canadian :)
> So you had Joe Biden as the nominee for the Democratic Party and then Joe Biden also listed as the nominee for the Working Families party.

I’m confused how this works. The candidate themselves doesn’t have to claim they are from a particular party to be listed as such? That seems wrong or misleading but what is the point? Is it some kind of hack to garner enough votes for a party to trigger some kind of funding?

Parties don't get votes (at least in most elections); the candidates do. Candidates can be members of a party and may be endorsed by it. In NY, it sounds like a party can endorse a non-member. This doesn't sound fundamentally wrong or misleading to me.
Differences are down to reporting entity. In the US this is usually the state, but is often the county. I don't think I saw any state with congressional district level differences.
Thanks - this is definitely clearing up some of my misconceptions about how the US votes!
Not only is it not easily available, but there's no current way to validate the vote casting or tabulation process. Biden literally couldn't prove that he won even if he cared to.

Some say this is a bug, but sounds more like a feature. If not, why wouldn't it be fixed when the technology exists? And why would people go over and above to have the current technology installed?