by runbycomment
4066 days ago
In your opinion, how much of that complexity goes away if you scrape Google's cache of a site instead of the site itself? I also wonder how feasible it is to hide behind proxies, especially ones owned by entities in other countries. If a site can't identify who is doing the scraping, it seems difficult for them to prove "this guy uses our data and must be scraping us".

Also, since this is presumably something you'll be doing as a hobby (money creates trails), the unfortunate reality is that "right" and "wrong" in copyright law matter much less than "Oh crap, I'm being sued for $500k in $further_away(New York|California), how do I defend this?" That's why you don't ignore the polite way of saying "go away", which is robots.txt, or the rude way, which is a C&D. If a lawsuit (the mean way) is the first communication you get from a company, odds are pretty good that an attorney can help, because judges are busy and don't want lawsuits to be the first thing unhappy companies try.
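
For what it's worth, honoring robots.txt takes only a few lines. Here is a rough sketch using Python's standard `urllib.robotparser`; the robots.txt body, the `hobby-bot` user agent, the `example.com` URLs, and the `allowed` helper are all made up for illustration, and a real scraper would fetch the file from the site rather than hard-code it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body, hard-coded so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/index.html",
                "https://example.com/private/data"):
        print(url, allowed(ROBOTS_TXT, "hobby-bot", url))
```

Against a live site you would call `set_url()` and `read()` on the parser instead of `parse()`, and skip any URL for which `can_fetch()` returns False.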