by portobelln 2824 days ago
I worked for 2 years on the crawl infrastructure team of a well-known SEO/analytics company that was pulling in over 120 billion web pages a month. It was definitely one of the most difficult projects I've ever worked on, and we did have a team of 6-7 incredibly smart people -- not "25 experts in distributed systems" though :P

This is a very lofty goal, and I'm not sure how you are going to tackle it with 3 people. However, I'm rooting for you and would love to get my invitation soon. Best of luck.

1 comment

> pulling in over 120 billion web pages a month

Can you tell us more? That's roughly 45K pages per second, and assuming each page load takes 1 second on average, you already need 45K workers. Are you talking about requests, or really loading web pages and evaluating them?
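
For anyone who wants to check the back-of-envelope numbers, here is a rough sketch in Python. The function names are made up for illustration, and the 30-day month and 1-second average page load are assumptions, not figures from the original poster. The "workers" estimate is just Little's law: fetches in flight = throughput x time per fetch.

```python
SECONDS_PER_DAY = 86_400


def pages_per_second(pages_per_month: float, days_in_month: int = 30) -> float:
    """Average fetch rate needed to hit a monthly page total."""
    return pages_per_month / (days_in_month * SECONDS_PER_DAY)


def concurrent_fetches(rate_per_second: float, seconds_per_page: float) -> float:
    """Little's law: requests in flight = arrival rate * time per request."""
    return rate_per_second * seconds_per_page


# Assumed inputs: 120 billion pages/month, 30-day month, 1 s per page.
rate = pages_per_second(120e9)
in_flight = concurrent_fetches(rate, seconds_per_page=1.0)

print(f"{rate:,.0f} pages/s")             # 46,296 pages/s
print(f"{in_flight:,.0f} fetches in flight")
```

With a 31-day month the rate comes out a bit under 45K pages per second, so the figure above is in the right range. Note that the result counts fetches in flight, not machines or threads: a single process using asynchronous I/O can keep many connections open at once.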