| I'm confused. By "clever indexing" I thought they meant indexing in the database sense of the word. The reason my search took 30 seconds is that it started by getting a list of every site with "from" on it, every site with "what" on it, and so on, then intersecting them all. That's how it ended up finding my quote. How else do you think it did it? -----
Edit: to find the string "from what it is to a", which occurs only buried in the middle of Shakespeare's texts, what do you think they do? In my opinion they intersect the lists of sites that contain each word, starting with the least common ones. It's easier if you search for something that has a few uncommon words: then you start with a small list and only have to intersect it with other small lists. When every word in the phrase appears on billions of pages (there are billions of pages that have the word "to" on them, same for "from", "what", "it", "is", "a"), you have to intersect all of those huge lists. Then you have to do a string search within the resulting set, since I put the phrase in quotation marks. There is no easy strategy. Hence the long search time. What else could they be doing? |
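To make the strategy I'm describing concrete, here is a minimal Python sketch of it: build a word-to-pages index, intersect the per-word lists starting with the rarest word, then check the exact phrase only on the pages that survive. The names `build_index`, `contains_phrase`, and `phrase_search` are made up for illustration, and this is only my guess at the general approach, not a description of what any real search engine does.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the set of page ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def contains_phrase(tokens, words):
    """True if `words` appears as a consecutive run inside `tokens`."""
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))


def phrase_search(phrase, docs, index):
    """Return sorted ids of pages that contain the exact phrase."""
    words = phrase.lower().split()
    if not words:
        return []
    # Sort the per-word page sets by size so the rarest word comes first
    # and the candidate set starts as small as possible.
    postings = sorted((index.get(w, set()) for w in words), key=len)
    candidates = set(postings[0])
    for pages in postings[1:]:
        candidates &= pages
        if not candidates:
            return []  # some word never co-occurs with the others
    # The quotation marks force this last step: scan each surviving page
    # for the words in the right order.
    return sorted(
        d for d in candidates if contains_phrase(docs[d].lower().split(), words)
    )
```

With a toy collection like `{1: "transform honesty from what it is to a bawd", 2: "it is what it is from a to z", 3: "to be or not to be"}`, pages 1 and 2 both survive the intersection because each contains all six words, but only page 1 passes the phrase check. That final scan is the part that gets expensive when every word in the phrase is common.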