Hacker News
by laumars 4354 days ago
That doesn't mean that you cannot still write a bot to scrape the content. If it's displayable in a web browser, then it's also parsable from Perl (or whatever language you choose).

Adding delays between page requests and/or distributing the requests would work around a lot of bot detection systems, but in the worst-case scenario there are projects out there that can solve CAPTCHAs.
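
To make the "delays between page requests" part concrete, here is a rough Python sketch (Python rather than Perl, purely as my choice of language). The names `fetch`, `throttled_fetch`, `min_delay` and `max_delay` are made up for illustration, and the 2 to 5 second range is an assumption, not a value known to get past any particular detection system.

```python
import random
import time
from typing import Callable, Iterable, Iterator, Tuple
from urllib.request import urlopen


def fetch(url: str) -> bytes:
    """Download one page using only the standard library."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def throttled_fetch(
    urls: Iterable[str],
    fetch_page: Callable[[str], bytes] = fetch,
    min_delay: float = 2.0,   # assumed lower bound, in seconds
    max_delay: float = 5.0,   # assumed upper bound, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[str, bytes]]:
    """Yield (url, body) pairs, pausing a random interval between requests."""
    for index, url in enumerate(urls):
        if index > 0:
            # Randomised pause so the requests are not evenly spaced.
            sleep(random.uniform(min_delay, max_delay))
        yield url, fetch_page(url)
```

You would loop over it with something like `for url, body in throttled_fetch(list_of_urls): ...`. Passing `fetch_page` and `sleep` in as parameters is only there so the loop can be tried without touching the network.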

Essentially this can be viewed like music or movie piracy: if it can be played, then it can be copied.

1 comment

Sure, but throttling is going to make it really slow to gather all the data you want.
If you're stalking someone, then you already have enough data to perform a more targeted scrape (i.e. what town to look up businesses from).