Hacker News
by Grimm1 2069 days ago
Only 2 companies in the US have an independent English-based index with the contents of the entire web. Granted, the sheer volume of data is a barrier to making the index, but setting that aside, only 4 US companies have crawled the entire internet. I'm going to have to disagree with you on that one. Writing a crawler with the scale and timeliness to crawl the entire web in a week or two requires some pretty solid engineering. I don't, however, disagree that building good search is also difficult.
1 comment

Google and Microsoft were not the first web spiders.

A famous example, Yahoo!, didn't walk away from search and partner with Microsoft because of the difficulty of building an index. They did it because it was going to cost billions per year to try to keep up with Google in producing results.

I'm not arguing there's no work to do in building an index, but the problems of crawling and indexing can be solved by cash. They're a moat against small challengers, but not against well-capitalized ones. Ranking and filtering require lots of research and tuning. That is the moat against even the well-capitalized.

Put another way: do you really disagree that Google would still easily dominate search based on result quality even if small startups got access to their index data?