| What is needed but AFAIK is never discussed is an objective index of what is on the www. (Not a cache of the entire www, and not a full-text search engine. Rather, an index similar to the Yellow Pages (by subject, alphabetical), but going a little further. For example, each site might submit a list of, say, 5 selected "permanent" URLs where a user could retrieve site information.) This is not an insurmountable task. And it need not be conducted by a private company. A significant amount of the work is already done, via zone files, for sites that register domain names. But this is only a start and is not comprehensive. Back in the day, early search engines required operators to submit their www sites to the search engine. That active involvement of www site operators seems to have been lost. There could well be a publicly run directory service for the www. Operators could submit their site to a public agency instead of a private company. Or they could at least make it easy for a very simple crawler to retrieve a sitemap.xml or some other file with a standard format for disclosing site information. Private companies have difficulty policing overzealous marketing and fraud in such situations. Today we have one company using "secret algorithms" that supposedly address the situation. But if a site is submitting information to a governmental agency instead of a private company, maybe it becomes a little less easy for marketers to bend the truth. There is more opportunity and incentive to enforce the consumer protection laws. Better for consumers. Users could still access Google to determine the popularity of a given www site (or its "relevance", if you believe that popularity has some bearing on relevance). Keep in mind that Google is a private company that encourages a bidding war between advertisers for a spot to the right of the top popularity ranking for a given search query. What happens behind the scenes of the auction process is opaque. Google has no incentive to be wholly objective. 
Give users more choice in how to look up www sites. (Note this is a little different than full-text search. It is far less complex.) Site discovery: this is a fundamental problem that is occasionally discussed. All those sites users never learn about because of search engine schemes like "PageRank". We see the same phenomenon in an "App Store". The top 10 are promoted excessively. All the rest are never discovered by the vast majority of users. Perhaps the only reason someone can make large sums selling an app is that once they get into the top 10, all other apps are effectively hidden from most users. This dynamic creates a certain hype and draws in more contributors, all trying to get into the top 10, each paying fees to the company behind the "App Store". Can we apply a similar analysis to Google search and the sale of AdWords? What might fuel demand for ads? The lure of a #1 rank or an ad to the right of it? Getting back to the issue: let users/developers work with a free, objective index not produced or manipulated by a private company. I can think of many ways to build efficient search, i.e., www site discovery, using such an index. I believe others would have even better ideas. We already have a privately held cache of the entire www. What we still need is a publicly accessible index into that cache so that users can discover www sites by means other than popularity. |
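
To make the "very simple crawler" idea concrete, here is a rough Python sketch of what retrieving a sitemap.xml and turning it into a directory record might look like. It assumes the site publishes a sitemap in the sitemaps.org format at /sitemap.xml; the function names, the record layout, and the limit of 5 URLs (taken from the "say 5 selected permanent URLs" suggestion above) are all made up for illustration, not part of any existing directory service.

```python
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by the sitemaps.org protocol for <urlset>/<url>/<loc>.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def fetch_sitemap(site, timeout=10.0):
    """Download https://<site>/sitemap.xml and return it as text.

    Assumption: the sitemap lives at the conventional /sitemap.xml path.
    """
    url = f"https://{site}/sitemap.xml"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse_sitemap(xml_text):
    """Return every <loc> URL listed in a sitemap document, in file order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


def build_index_entry(site, xml_text, limit=5):
    """Hypothetical directory record: the site plus its first `limit` URLs."""
    return {"site": site, "urls": parse_sitemap(xml_text)[:limit]}


if __name__ == "__main__":
    # Inline sample so the sketch runs without touching the network.
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.org/</loc></url>"
        "<url><loc>https://example.org/about</loc></url>"
        "</urlset>"
    )
    print(build_index_entry("example.org", sample))
```

For a live site you would call build_index_entry(site, fetch_sitemap(site)) instead of passing the sample. Note how little machinery is involved compared to full-text search: no page content is downloaded or ranked, only a short list of URLs the operator chose to disclose.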