by t_a_v_i_s 1166 days ago
I think Import.io, Bright Data, and Zyte would fall into that category.

These are web scraping services. Data ownership is a grey zone at best, depending on which country you're in. Besides, copyrighted data might be scraped by accident. What I am proposing is much stronger than that: something like an "audited" dataset that comes with guarantees because its curation can be fully traced back to the source.
I'm guessing you can pay one of these companies to meet these types of requirements.

I know that highly regulated financial institutions that purchase web-scraped data have very strict rules about the data they buy.

The US has an organization dedicated to this: https://www.investmentdata.org