by portobelln
2824 days ago
I worked for 2 years on the crawl infrastructure team of a well-known SEO/analytics company that was pulling in over 120 billion web pages a month. It was definitely one of the most difficult projects I've ever worked on, and we did have a team of 6-7 incredibly smart people -- not "25 experts in distributed systems" though :P This is a very lofty goal, and I'm not sure how you are going to tackle it with 3 people. However, I'm rooting for you and would love to get my invitation soon. Best of luck.
|
Can you tell us more? That's roughly 45K pages per second, and assuming each page load takes 1 second on average, you already need about 45K concurrent workers. Are you talking about raw requests, or actually loading web pages and evaluating them?
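
For anyone who wants to check the back-of-envelope math, here is a rough sketch. The 30-day month and the function names are my own assumptions, not anything from the parent comment. The second step is just Little's law: average fetches in flight = arrival rate x average time per fetch.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_second(pages_per_month: float, days_per_month: int = 30) -> float:
    """Average crawl rate, assuming a 30-day month by default."""
    return pages_per_month / (days_per_month * SECONDS_PER_DAY)


def concurrent_fetches(rate_per_second: float, seconds_per_page: float) -> float:
    """Little's law: average number of fetches in flight at any moment."""
    return rate_per_second * seconds_per_page


rate = pages_per_second(120e9)            # 120 billion pages a month
in_flight = concurrent_fetches(rate, 1.0)  # assumed 1 second per page load
print(f"{rate:,.0f} pages/s, {in_flight:,.0f} fetches in flight")
```

With these assumptions it comes out a little above 46K per second, so "45K" is the right ballpark. Note this counts fetches in flight, not machines or processes: with non-blocking I/O a single worker can hold many plain HTTP requests open at once, whereas fully rendering pages is far heavier per fetch.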