by netvarun
5021 days ago
Might be of interest:
I wrote a post a couple of weeks back about our distributed crawling architecture, built with Perl, Redis, and Gearman: "How We Built Our 60-Node (Almost) Distributed Web Crawler"
http://hackerne.ws/item?id=4469911
|
FWIW, the value prop of IronWorker is never having to deal with servers again, and only paying for the seconds you're actually crawling. Fire up a million crawlers (workers) and they are automatically distributed across large sets of machines behind the scenes (no "spin up, tear down" either).
So in essence, it's a completely hosted version of what you described. The power is best seen by just trying it: no software (except for your code), no installations, no servers, etc.
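
To make the "one worker per crawl task" idea concrete, here is a rough local sketch in Python. It is not IronWorker's API: a thread pool on one machine stands in for the hosted worker fleet, and `crawl` and `fan_out` are hypothetical names made up for illustration. The crawl step only parses the URL, so nothing touches the network.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse


def crawl(url):
    """Hypothetical stand-in for a crawl task; it only parses the URL."""
    return {"url": url, "host": urlparse(url).netloc}


def fan_out(urls, max_workers=4):
    """Run one crawl task per URL and return the results in input order."""
    # Assumption: a local thread pool plays the role of the hosted workers.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(crawl, urls))


if __name__ == "__main__":
    for result in fan_out(["http://example.com/a", "http://example.org/b"]):
        print(result)
```

In a hosted setup the pool would be replaced by whatever call queues a task with the service, and the per-task code would stay about the same.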