by re_format 5270 days ago
The fact that they have to make changes to their system in order to not have useless crap appear at the top of the results tells us something: either people are searching for crap or the portion of the web Googlebot is crawling is full of crap.

Neither is something the search engine can fix for you.

With respect to the latter idea, the search engine may in fact be contributing to it by encouraging more crap to be created, because it easily percolates to the top of their "intelligent" results and users blindly click on result #1. And no doubt many users see these results as equivalent to "the web". Whatever Google returns, to them, that's "the web".

You can think about the web through the lens of "search engine results" and evaluate the web based on whatever is returned from your search engine queries.

Or you can think of the web as a huge mess of websites, some of which are useful, most of which are crap, and many of which an aggressive search engine might index.

Are you evaluating search results, or websites?

I'm evaluating websites, individually. Because that is what the web is. To me, Google is not the web. Google might give me some clues about some sites. They do an enormous amount of grunt work crawling them.

But it's up to me to do the final evaluation. To decide whether a site is useful or whether it is crap.

And there are other ways to discover websites besides using Google. How do you think Google learns about existing and new websites? Voluntary disclosure by the webmasters?

It sounds like you want someone to evaluate websites for you. I doubt you are alone in that regard.

This is not a new problem.

However, unlike you, I do not see Google as providing any viable solution.

1 comment

The fact that they have to make changes to their system in order to not have useless crap appear at the top of the results tells us something: either people are searching for crap or the portion of the web Googlebot is crawling is full of crap.

No, it means the ranking algorithm is evaluating the results wrongly. Which is what they're trying to fix.

With respect to the latter idea, the search engine may in fact be contributing to it by encouraging more crap to be created, because it easily percolates to the top of their "intelligent" results and users blindly click on result #1. And no doubt many users see these results as equivalent to "the web". Whatever Google returns, to them, that's "the web".

But that's the point, isn't it? It shouldn't easily percolate to the top. That's what their algorithms are for. If it does, they need to be fixed.

Are you evaluating search results, or websites?

I'm evaluating websites, individually. Because that is what the web is. To me, Google is not the web. Google might give me some clues about some sites. They do an enormous amount of grunt work crawling them.

But it's up to me to do the final evaluation. To decide whether a site is useful or whether it is crap.

I don't get what you mean by "Google being the web". Of course the final evaluation is up to the user. But if Google can rank the results more like you would, you waste less time clicking through the crap to get to what you want.

And there are other ways to discover websites besides using Google. How do you think Google learns about existing and new websites? Voluntary disclosure by the webmasters?

Actually, they do that too. But mostly they find sites by painstakingly following every link recursively, something which is obviously impossible for a person to do unless they want to be limited to 0.0...01% of the web.
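To make "following every link recursively" concrete, here's a rough sketch of link-based discovery. It is not how Googlebot actually works; it assumes a made-up in-memory link graph (`TOY_WEB`) in place of fetching and parsing real pages, and the `crawl` function and the `.example` hostnames are invented for illustration.

```python
from collections import deque


def crawl(seed, links):
    """Return pages reachable from `seed` by following links, breadth-first.

    `links` is a hypothetical map from a URL to the URLs it links to;
    a real crawler would download each page and extract its links instead.
    """
    seen = {seed}          # URLs already discovered, so loops don't repeat work
    queue = deque([seed])  # URLs discovered but not yet visited
    order = []             # visit order, for inspection
    while queue:
        url = queue.popleft()
        order.append(url)
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


# Toy web (assumed data): nothing links to "orphan.example".
TOY_WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": ["a.example"],
    "d.example": [],
    "orphan.example": ["a.example"],
}

discovered = crawl("a.example", TOY_WEB)
print(discovered)
```

In this toy graph the crawl never reaches `orphan.example`, because no page links to it. That is roughly why voluntary submission by webmasters still has a role alongside crawling.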