by dor_jack_2
3352 days ago
Loop/spam prevention was done by mixnode; I'm not sure how they do it. The data does not follow a DFS or BFS pattern, so the number of pages per site varies greatly with each host's server capacity and anti-crawling configs. There was a minimum of 10 seconds between follow-up requests to the same website unless robots.txt had a lower delay. Pretty polite...
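
For anyone curious what that delay rule could look like in practice, here is a rough sketch. It is not mixnode's code (I don't know how they implement it). The names `delay_for` and `HostThrottle` are made up for illustration, and reading the delay from a `Crawl-delay` line via Python's standard `urllib.robotparser` is my assumption. Delays larger than 10 seconds aren't covered by the rule as described, so this sketch simply keeps the 10 second default for them.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

DEFAULT_DELAY = 10.0  # seconds between requests to the same site, per the comment


def delay_for(robots_txt: str, user_agent: str = "*") -> float:
    """Return the per-site delay: 10 s unless robots.txt declares a lower one."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    declared = parser.crawl_delay(user_agent)  # None if no Crawl-delay applies
    if declared is None:
        return DEFAULT_DELAY
    # Assumption: a declared delay above the default is not modelled here.
    return min(DEFAULT_DELAY, float(declared))


class HostThrottle:
    """Hypothetical helper that tracks when each host may be requested next."""

    def __init__(self) -> None:
        self._next_allowed: dict[str, float] = {}

    def wait_time(self, url: str, delay: float, now: float) -> float:
        """Reserve the next request slot for the URL's host.

        Returns how many seconds the caller should sleep before fetching.
        """
        host = urlsplit(url).netloc
        start = max(now, self._next_allowed.get(host, now))
        self._next_allowed[host] = start + delay
        return start - now
```

A crawler loop would call `wait_time(url, delay_for(robots_txt), time.monotonic())` and sleep for the returned number of seconds before each fetch. Requests to different hosts don't block each other, which fits the crawl order not looking like plain DFS or BFS.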