by henrik1409
3905 days ago
It's great to see discussions going on here - I would like to tie a few comments to the questions about the ethical aspects of web scraping: As some have pointed out, scraping is not exactly a new thing, and a lot of the biggest sites out there are built on the basis of web scraping or crawling. We provide a tool and expect you to use that tool while abiding by the law - and if not, we will of course shut your account down immediately. Breaking the law includes violating copyrights and performing DDoS attacks (although they would be rather small attacks, since even 50 concurrent agents is no big deal for most websites). We consider ourselves good netizens. We wish nothing more than to provide a good, easily accessible and safe tool for extracting valuable information from the internet, be it for a price comparison site in a market that lacks transparency, business intelligence for your company to make informed and wiser decisions, or a PhD project that requires access to millions of data points available online in unstructured form. Additionally, if you feel we're providing services that have ill intent - we are not providing any services (CAPTCHA and proxy rotation) that anyone with a bit of programming skill cannot easily use in their own software. The main difference is that we are actively improving and focusing not only on making a good experience for our users - but also on minimizing the impact on the sites being scraped. This involves several things like automated throttling and slow-site detection, request caching, and blocking requests to services such as Google Analytics, so as not to interfere with site owners' stats.
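To make the "automated throttling and slow-site detection" and analytics-blocking ideas above more concrete, here is a minimal Python sketch. It is not the service's actual implementation; the class name `PoliteThrottle`, the function `should_fetch`, the thresholds, and the blocked-host list are all hypothetical and chosen only for illustration. The idea is to back off when a site responds slowly and to skip requests to analytics hosts entirely.

```python
from urllib.parse import urlparse

# Hypothetical list of analytics hosts to skip (an assumption for this sketch).
BLOCKED_HOSTS = {"www.google-analytics.com", "google-analytics.com"}


class PoliteThrottle:
    """Adjusts the delay between requests based on observed response times."""

    def __init__(self, base_delay=1.0, slow_threshold=2.0, max_delay=30.0):
        self.base_delay = base_delay          # seconds to wait when the site is healthy
        self.slow_threshold = slow_threshold  # responses slower than this count as "slow"
        self.max_delay = max_delay            # upper bound on the back-off
        self.delay = base_delay

    def record(self, response_seconds):
        """Update and return the delay to wait before the next request."""
        if response_seconds > self.slow_threshold:
            # Slow site detected: double the delay, up to the cap.
            self.delay = min(self.delay * 2, self.max_delay)
        else:
            # Site looks healthy: halve the delay, but never below the base.
            self.delay = max(self.delay / 2, self.base_delay)
        return self.delay


def should_fetch(url):
    """Return False for URLs on a blocked analytics host."""
    host = urlparse(url).hostname or ""
    return host not in BLOCKED_HOSTS
```

In use, a scraper would call `should_fetch(url)` before each request, time the response, pass the elapsed seconds to `record`, and sleep for the returned delay. Request caching is left out of this sketch.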
Thank you.
EDIT: found that in your FAQ:
"Since disclosing IP’s and user agents would allow anyone to identify all traffic coming from our system – we naturally never do."
That is the opposite of being a good netizen and I hope I'll be able to sue you once I find out your services are helping to scrape my content.
2nd EDIT: Found out that you reside in Denmark and therefore in the EU; that makes it way easier, then.