by turtlebits 2530 days ago
The right thing to do would be to reach out to those sites and see if they have paid options for getting the data you need.
2 comments

And what happens when they ignore you? I've reached out to tons of website operators to ask for machine-readable access to their data for academic, personal, and professional projects. I have never gotten a reply and have had to resort to scraping.
I can second this for public records websites.

A previous company I worked for aggregated publicly recorded mortgage data. The mortgage data was scraped from municipal sites on a nightly basis because it was not available as a bulk download or for purchase.

We asked on several occasions for a service we could pay for to get a bulk download of this data, but the municipalities did not have the know-how to provide one: they were using systems from a private vendor, and requesting modifications was prohibitively expensive for them. As a result, we worked hand in glove with the municipalities to ensure our scraping was not stressing their infrastructure, and I think that's the best we were able to do in this case.

Well, when that option is available, as in the case of something like SAMBA WEB MVR, we absolutely opt for that instead, and pay our dues.