Hacker News
by dspeyer 5606 days ago
80/20 doesn't really apply here. The whole point of search is to handle the obscure stuff.
2 comments

Yes and no. Pretty basic spelling correction is pretty good, and as Google makes its spelling operation more sophisticated, I actually find myself needing to reverse their corrections more often.

Indexing is more complicated. You could argue that 20% of Google's index contains at least 80% of the information people need. The problem, like the old advertising saying, is figuring out which 20%. So if you have a clever way to address content quality and uniqueness, suddenly your crawling and indexing costs plummet.
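
To make the "content uniqueness" part concrete, here is a minimal sketch of one common way to drop near-duplicate pages before indexing: compare word shingles with Jaccard similarity. This is only an illustration, not how Google or anyone else actually does it. The function names, the shingle size of 3, and the 0.8 threshold are all assumptions picked for the example, and the pairwise comparison would not scale to a real crawl.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles in text (k=3 is an arbitrary choice)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_near_duplicates(docs, threshold=0.8):
    """Keep a document only if it is not too similar to one already kept.

    The threshold is a hypothetical tuning knob, not a known-good value.
    """
    kept = []
    kept_shingles = []
    for doc in docs:
        s = shingles(doc)
        if all(jaccard(s, t) < threshold for t in kept_shingles):
            kept.append(doc)
            kept_shingles.append(s)
    return kept
```

Every page filtered out this way is one less page to store and rank, which is where the cost saving would come from. Deciding which of the surviving pages are actually worth keeping is the harder "which 20%" problem, and this sketch does nothing for it.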

While it may be argued that `the obscure stuff' is harder to find or may be more valuable/important, I'd disagree with you.

First, the bulk of searches seem to be for the usual stuff. Thus the bulk of the provider's ad revenue, and _perhaps_ the bulk of the added value for users, stems from searching for the usual stuff.

Second, as the obscure stuff often uses uncommon terminology and names, it's technically easier to search for it and display relevant results. The trick is to give useful results for the most common things. Consider how hard it may be to automate (!) picking sensible results for words like `the' or `free'. Or for dates, which are written in a zillion formats. Of course, various approaches (like stopword lists and criteria for building them) have been implemented.
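
As a rough sketch of what a "criterion for stopwords" can look like: treat any term that appears in more than some fraction of documents as a stopword, then drop those terms from queries. The function names and the 0.5 cutoff are assumptions made for this example, not a standard.

```python
from collections import Counter


def document_frequency_stopwords(docs, max_fraction=0.5):
    """Return terms that occur in more than max_fraction of the documents.

    max_fraction is a hypothetical cutoff chosen for illustration.
    """
    df = Counter()
    for doc in docs:
        # Count each term once per document, not once per occurrence.
        df.update(set(doc.lower().split()))
    cutoff = max_fraction * len(docs)
    return {term for term, n in df.items() if n > cutoff}


def strip_stopwords(query, stopwords):
    """Drop stopwords from a whitespace-tokenized query."""
    return [t for t in query.lower().split() if t not in stopwords]
```

With the four toy documents "the free encyclopedia", "the cat sat", "free the whales" and "the obscure zeugma", only `the' clears the 0.5 cutoff. `free' appears in exactly half of them and survives, which shows how much hangs on where the cutoff sits. A rare word like `zeugma' is never at risk, and that is the sense in which the obscure stuff is the easy case.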