Hacker News new | ask | show | jobs
by etrain 2457 days ago
Assume 100 pages on each onion address (it’s probably power-law but let’s just assume that’s the mean). Latency with Tor is super high. Assume average of 5s to load a single page. This is generous because tail latency will probably dominate mean latency in this setting.

These things can happen in parallel but let’s also assume no more than 32 simultaneous TCP connections per host through a Tor proxy.

So we’re looking at ~75k1005/32 seconds = 14 days to run through all of them. You may not need to distribute this but there are situations (e.g. I want a fresh index daily) where it is warranted.