Hacker News
by miki123211 22 days ago
There is something to be said for "one-way indexes."

Imagine you run a company register for a local government. You want to let people look up companies by their registration number (which they must disclose in all communications to you) to see if they're legit and whether any warnings have been raised against them. You don't want unscrupulous marketers to just be able to `SELECT * FROM companies WHERE type='nail_salon' AND city='london'`.

If you aren't super strict about scraping, some shadowy business in Neverland, completely unconcerned with following your laws, will build that database.
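To make the "one-way index" idea concrete, here is a minimal Python sketch. The `Company` fields and the `OneWayIndex` class are hypothetical names made up for illustration. The only operation exposed is a point lookup by registration number; there is deliberately no way to list or filter records. It assumes registration numbers are not trivially guessable, otherwise a scraper can just enumerate them and you are back to rate limiting.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Company:
    # Hypothetical record layout, not taken from any real register.
    reg_no: str
    name: str
    status: str
    warnings: int


def _normalize(reg_no: str) -> str:
    # Treat " ab123 " and "AB123" as the same key.
    return reg_no.strip().upper()


class OneWayIndex:
    """Point lookups by registration number only: no listing, no filtering."""

    def __init__(self, companies: Iterable[Company]) -> None:
        self._by_reg_no = {_normalize(c.reg_no): c for c in companies}

    def lookup(self, reg_no: str) -> Optional[Company]:
        # Returns the matching company, or None if the number is unknown.
        return self._by_reg_no.get(_normalize(reg_no))


index = OneWayIndex([
    Company("AB123", "Example Nails Ltd", "active", 0),
    Company("CD456", "Sample Plumbing Ltd", "dissolved", 2),
])
found = index.lookup(" ab123 ")
missing = index.lookup("ZZ999")
```

The class defines no iteration, search, or "by city" method, so the equivalent of the `SELECT ... WHERE type=... AND city=...` query simply is not part of the public interface.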

2 comments

> Imagine you run a company register for a local government.

Is this data not public for some reason? I don't think it hurts to have multiple copies spread across public offices and private companies. What really hurts is a private company hammering your web server for its own profit. They should get their own copy.

If the purpose of the index is to let people look up registrations and warnings, you could probably just serve the whole list. This is public information and doesn't need to be gated. The CSV header could be:

`reg_no,status,no_warnings_last_12m`
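A rough sketch of generating such a file with Python's standard `csv` module. The `export_csv` function and the sample rows are made up for illustration, the column names are the suggested header in lowercase, and reading `no_warnings_last_12m` as "number of warnings in the last 12 months" is an assumption.

```python
import csv
import io
from typing import Iterable, Tuple

# Column names from the header suggested above, lowercased.
HEADER = ["reg_no", "status", "no_warnings_last_12m"]


def export_csv(rows: Iterable[Tuple[str, str, int]]) -> str:
    """Render (reg_no, status, warning_count) tuples as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(HEADER)
    for reg_no, status, warning_count in rows:
        writer.writerow([reg_no, status, warning_count])
    return buf.getvalue()


# Hypothetical sample data.
csv_text = export_csv([
    ("AB123", "active", 0),
    ("CD456", "dissolved", 2),
])
```

The output can be written to a static file and regenerated on a schedule, so anyone who wants a copy downloads it once instead of crawling the lookup page.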