Hacker News new | ask | show | jobs
by seeknotfind 943 days ago
Many many orders of magnitude. It really depends on the search algorithm. If you're looking up a single term, there are many standard ways to index this, but for instance using a trie or hash table, the lookup cost would be a number of operations proportional to the search term length (e.g. for "foo", it's a of length 3, times some small number of operations for each letter). Plus then if you want to combine the results and sort by frequency, to estimate the cost of this, add up the number of pages from each term, p_1 + p_2 + ... = P. Then the number of operations might be P*log(P). This dominates the cost. So if your search terms appear in 1,000,000 of these pages, you're talking about on the order of ~6,000,000 small operations. An implementation would be able to execute this many operations on a single CPU in 10s or 100s of milliseconds maybe, for an elementary unoptimized algorithm.