by mikaraento
301 days ago
Around 2008, a core step in search was basically a grep over all documents. The grep was distributed over roughly 1000 machines so that the documents could be held in memory rather than on disk. Inverted indices were not used because they worked poorly for “an ordered list of words” (as opposed to a bag of words). And this doesn’t even begin to address the ranking part.
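To make the "ordered list of words" vs. "bag of words" distinction concrete, here is a toy Python sketch. It is not a reconstruction of any real search system: the corpus, the shard count, and every function name (`shard_docs`, `scan_shard`, `distributed_grep`, `build_bag_index`, `bag_lookup`) are made up for illustration, and the "machines" are just dictionaries in one process.

```python
from collections import defaultdict

# Hypothetical toy corpus; doc ids and texts are invented for illustration.
DOCS = {
    1: "new york city travel guide",
    2: "york is a city in england with a new museum",
    3: "guide to the city of new york",
}

def shard_docs(docs, num_shards):
    """Split tokenized documents across shards (stand-ins for machines)."""
    shards = [dict() for _ in range(num_shards)]
    for doc_id, text in docs.items():
        shards[doc_id % num_shards][doc_id] = text.split()
    return shards

def scan_shard(shard, phrase):
    """Linearly scan one in-memory shard for an exact, ordered phrase."""
    n = len(phrase)
    return [
        doc_id
        for doc_id, words in shard.items()
        if any(words[i:i + n] == phrase for i in range(len(words) - n + 1))
    ]

def distributed_grep(shards, query):
    """Send the query to every shard and merge the matching doc ids."""
    phrase = query.split()
    hits = []
    for shard in shards:  # sequential here; assumed parallel in a real system
        hits.extend(scan_shard(shard, phrase))
    return sorted(hits)

def build_bag_index(docs):
    """Inverted index without positions: word -> set of doc ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[word].add(doc_id)
    return index

def bag_lookup(index, query):
    """Return docs containing every query word, in any order."""
    sets = [index.get(word, set()) for word in query.split()]
    return sorted(set.intersection(*sets)) if sets else []

if __name__ == "__main__":
    print(distributed_grep(shard_docs(DOCS, 2), "new york"))  # [1, 3]
    print(bag_lookup(build_bag_index(DOCS), "new york"))      # [1, 2, 3]
```

The position-free index also returns document 2, which contains both words but never the phrase, while the scan only returns documents where the words appear in order. An index that stores word positions could filter such false matches too; the sketch only shows the bag-of-words case.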
Wikipedia [1] links to "Jeff Dean's keynote at WSDM 2009" [2], which suggests that indices were most certainly used.
Then again, I am no expert in this field, so if you could share more details, I'd love to hear them.
[1] https://en.wikipedia.org/wiki/Google_data_centers
[2] https://static.googleusercontent.com/media/research.google.c...