|
EDIT: I've noticed all the replies and I'd like to acknowledge them. Unfortunately, I feel very stupid for not screenshotting what I saw when I searched an hour ago. I now see 62,900 results, and I can load up to page 6. I can't prove that I was unable to load page 2 before, but it's true. My original comment remains unedited below. -- For a concrete demonstration of pathological de-ranking, do a query for "site:web.archive.org". I get "59,000 results" on page 1, but page 2 will never load! There are a few results, which proves a) that web.archive.org is not using robots.txt or other blocking techniques, and b) that Google's infrastructure is inhaling content. But it's invisible. Think about how sad this is: once a site goes dead, it's effectively offline, even though the content is still publicly accessible. If only that content were indexed by a decent search engine. Practically speaking, I totally acknowledge that archived content is complex to surface; sites can be pulled offline because content needs to be disappeared for any number of reasons, etc. I recognize the general difficulty of getting this right. So I'm not _really_ arguing "if only this were surfaced", because that would be unfair. I'm more saying "hey look, this is what it looks like when something has been completely killed," as a demonstrable and extreme datapoint. |