Hacker News new | ask | show | jobs
by moron4hire 2327 days ago
The data does remain free, as long as LinkedIn still provides it for free.

The data without the noise is what you're paying for. The service of winnowing out what you care about from what you don't care about.

Considering how big of an effort it is, and that the source from which it came is still available, why should the cleaned data be free? If I collect fallen trees from public land, chop it into usable firewood, should my bundles of firewood also be free? Or I collect solar power with my own solar cells, should I have to give you the electricity for free?

2 comments

I think this is especially relevant when it comes to things that fall under disclosure & transparency requirements - a lot of information that is legally required to be made available isn't legally required to be convenient. So, as a patient, you may have the absolute right[1] to a free copy of the charge master[2] of a hospital you're admitted to but it could be required that you pick it up in person or that it is only supplied in microfliche form... so a company that's aggregated this and is reselling it can deliver real value.

1. This specific example is BS but plausible - I just wanted something more specific than the vagaries around things like FOIAs or shareholder reports which both have specific facts that can be rendered useless unless you have the context.

2. Basically, list of how much procedures cost.

Absolutely spot-on.

I'm thinking of processed GIS data. If you have ever tried using the various formats that are supplied by government sites, you know what a huge pain it is.

I'm happy to pay a reasonable price for an interpreted and bowdlerized version.

I actually have! I had to import a huge file of all of the culverts around storm drains in a state, and each culvert was multiple pieces of geometry, none of them grouped together in any logical way. It was just a huge list of rectangles that looked like culverts when viewed visually but no way to identify them as being one culvert without heuristics on how close each rectangle was to others. Massively long process that should not have been so.