by netvarun
4868 days ago
Some great advice here on crawling at scale, which has inspired our crawlers a lot: http://news.ycombinator.com/item?id=4367933 Basically, it boils down to three things:
1. If the site is slow, crawl slooowly.
2. If you see non-200 HTTP status codes, stop!
3. Obey robots.txt and speed restrictions.
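
For anyone who wants to see roughly what those three rules could look like in code, here is a minimal Python sketch. It is not how our crawlers (or the ones in the linked thread) are actually written. The user agent name, the helper names, the 1-second floor, and the 10x slowdown factor are all assumptions made up for illustration. Only `urllib.robotparser` comes from the standard library.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-crawler"  # hypothetical user agent name


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def next_delay(response_seconds: float,
               crawl_delay: Optional[float],
               min_delay: float = 1.0,
               slowdown: float = 10.0) -> float:
    """Rule 1: the slower the site answers, the longer we wait.

    The floor and the multiplier are arbitrary illustrative values.
    A Crawl-delay from robots.txt (rule 3) is never undercut.
    """
    delay = max(min_delay, response_seconds * slowdown)
    if crawl_delay is not None:
        delay = max(delay, float(crawl_delay))
    return delay


def should_stop(status: int) -> bool:
    """Rule 2, read literally: anything other than 200 means stop."""
    return status != 200


def plan_fetch(parser: RobotFileParser, url: str,
               last_response_seconds: float) -> Optional[float]:
    """Return how long to sleep before fetching url, or None if disallowed."""
    if not parser.can_fetch(USER_AGENT, url):
        return None  # rule 3: robots.txt says no
    return next_delay(last_response_seconds, parser.crawl_delay(USER_AGENT))
```

A crawl loop would call `plan_fetch` before each request, sleep for the returned number of seconds, and halt the crawl of that host as soon as `should_stop` returns true. Stopping on every non-200 is the strict reading of rule 2; a real crawler might prefer to back off and retry later on some codes.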