by stickfigure 2532 days ago
Public records are public.

The fact that some government organizations make it hard to retrieve public records is a flaw in the system. I'd be in favor of a national law requiring all public records to be published in machine-readable form.

In the meantime, it is our civic responsibility to conspire to circumvent these misbehaving public services.


If such a national law were passed with funding guaranteed for open publication of records, I would endorse your point of view.

No such funding exists, and municipalities are regularly denied tax increases by their voters for any reason — much less public records publication that would often embarrass and humiliate those same voters.

So in essence you're asking them to cut public services and staffing in order to give hundreds of dollars of IT costs a month to for-profit businesses who can't be bothered to pay some small fraction of their revenue for the costs of delivering those records.

It is our civic responsibility to republish those records for free as citizens. Doing so for profit at the expense of citizens is unethical.

If OP republishes all records received in a freely-downloadable, unrestricted form, then I would happily help them fix their scrapers. They, of course, do not.

Often what the municipalities are doing for public records is harder and more expensive than just publishing an API. So the funding excuse doesn't really pass muster with me.
Can you name a single for-profit public records scraper who republishes the parsed data scraped without charging for data access?

The public records are public. Charging for them is, by the above arguments, immoral. Therefore, not only the municipalities but also the businesses profiting from those public records owe us their scraped data, for free, without regard for profit concerns.

Not one for-profit business does so. Why is their immoral action acceptable, when the same action by a municipality is not?

There's nothing immoral about charging for content that you've aggregated. People sell dictionaries.

The problem here is that instead of building APIs (or just posting to FTP sites), governments are building offices and funding staff to answer snail mail requests. Or building sophisticated web forms and search engines.

It's obvious how we got to this point (before the internet, you obtained public records by walking into an office) but it's long past time to change. We don't need fancy web forms to search and find data; cut all that out and just provide data in machine-readable form to anyone who wants it.

Someone will build a pretty commercial interface to public records data. Chances are, they can do it for less than the 8-figure sum required for UI development in the public sector. Win-win.

It is not obvious to me that reducing the cost of consulting public data is necessarily a good thing. Just because this data is accessible does not mean it should always be accessible cheaply. For example: trial records should be public, but it would probably not be nice to have your entire judicial record displayed in people's glasses.
Disagree. It's inherently in the public interest to have access to this data as easily as possible. If it's too embarrassing then that's a cultural problem.
Some "public" records fall into a gray area as to whether they should be published at all. Take salaries: an employer might forbid disclosing them, but anyone can request anyone's salary from the government because it's public. But if they could be downloaded from an FTP ...
In the US it is illegal for employers to forbid disclosing salaries.

Discussing salaries is a taboo created by industries to stifle wages.

https://www.monster.com/career-advice/article/truth-about-di...

What government agency allows you to see arbitrary other people's salaries?
Can you name a single for-profit public records scraper who republishes the parsed data scraped without charging for data access?

Currently? Not off the top of my head. But there was one that scraped municipal records in a large midwest city and made them public for free because they were confusing to get to otherwise.

Unfortunately, the company was bought by a larger company and that portion of what they did was shut down.

Loveland (now apparently called Landgrid). https://landgrid.com/
Loveland is such a cool organization
Public records are published based on certain demand assumptions.

If the real-world demand for, say, some GIS data is hundreds of requests per day, then a crawler that comes in with hundreds of requests PER MINUTE will obviously stress the infrastructure. Adjusting infrastructure to cope is not an instant process, nor is it a sure thing to begin with given all the budgeting formalities. So your "civic duty" will ultimately result in destruction of these services, because they simply don't have the means to deal with such thoughtless activism.

You've made an unfounded assumption -- that is, that the person you're responding to is scraping irresponsibly. If they are, as they say, simply replacing human researchers with the equivalent bots, then the net load change from automation is zero, or possibly even negative.
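
For what it's worth, keeping a bot at roughly human-researcher load is not hard. Here is a minimal sketch of a client-side throttle in Python, assuming a simple fixed cap on requests per minute. The `Throttle` class, the `crawl` helper, and the `fetch` callable are hypothetical names for illustration, not anything the posters above described using; a real scraper would plug in its own HTTP client and pick a cap that suits the service.

```python
import time
from typing import Callable, Iterable, Iterator, Optional


class Throttle:
    """Enforce a minimum interval between successive requests."""

    def __init__(
        self,
        max_per_minute: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_per_minute <= 0:
            raise ValueError("max_per_minute must be positive")
        # Assumption: a fixed spacing between requests, no bursts allowed.
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the last permitted request

    def wait(self) -> float:
        """Block until the next request is allowed; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._last + self.min_interval - now)
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


def crawl(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    throttle: Throttle,
) -> Iterator[bytes]:
    """Fetch each URL in turn, pausing so the cap is never exceeded."""
    for url in urls:
        throttle.wait()
        yield fetch(url)
```

With `Throttle(max_per_minute=2)`, for example, the bot makes at most one request every 30 seconds no matter how many records are queued, which is the kind of load the "hundreds of requests per day" assumption can absorb.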