by klapinat0r 5128 days ago
> They saw the referrals from our sites, then saw that we were displaying showtimes without being licensed to do so (how they knew this I'm not sure)

The referrals are likely the headers you send when scraping, e.g. Referer: <your newspaper>.tld. Depending on whether you actively set the User-Agent header, that might also have contributed to them catching on (be it an omitted User-Agent, "urllib2", "<newspaper> Bot 1.0 +<newspaper>.tld; don't sue us", and so forth). If you run a content provider and try to protect your content/pageviews/API, the absence of either of these headers is also worth looking out for.
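To make the header point concrete, here is a rough sketch of the kind of check a content provider might run over incoming request headers. The function name scraper_signals and the LIBRARY_TOKENS list are made up for illustration, not anything Fandango is known to use, and the heuristic is deliberately naive: a missing Referer is also normal for direct visits, so treat the output as a hint rather than proof.

```python
# Hypothetical substrings that suggest an HTTP library or self-declared bot.
# This list is an assumption for illustration, not a known production rule set.
LIBRARY_TOKENS = ("urllib", "python-requests", "curl", "wget", "bot")


def scraper_signals(headers):
    """Return a list of reasons why a request's headers look automated."""
    # HTTP header names are case-insensitive, so normalise the keys first.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    signals = []

    user_agent = normalised.get("user-agent", "")
    if not user_agent:
        signals.append("missing User-Agent")
    else:
        lowered = user_agent.lower()
        for token in LIBRARY_TOKENS:
            if token in lowered:
                # Report only the first matching token.
                signals.append("library User-Agent: " + token)
                break

    # Browsers following a link usually send a Referer; scrapers often do not.
    if not normalised.get("referer"):
        signals.append("missing Referer")

    return signals
```

Calling scraper_signals({"User-Agent": "Python-urllib/2.7"}) would flag both the library User-Agent and the missing Referer, while an ordinary browser request that arrived via a link should come back with an empty list.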


Good point. I should have mentioned we weren't scraping Fandango itself, though (we were using another source). So the question is more: how did they know that we didn't have a license to display these showtimes? How did they know we were scraping and not just displaying them legitimately? Sure, they knew we didn't have a license to display their link, but I don't see how that alone would lead them to the other conclusion. If that makes sense...