by userbinator 2616 days ago
It really angers me that a page may be essentially exactly what I'm looking for, yet Google may refuse to find it if it was published long ago.

Something like a news search engine would definitely be better off prioritising the new results, but for something more general-purpose, it's an absolutely horrible choice.

I know this may be a bit of an edge-case, but I frequently search for service information or manuals for products that predate even the invention of the Internet by several decades. It saddens me that the results are clogged with sites selling what may really be public-domain content, and now I'm even more angered by the fact that what I'm looking for is probably out there and could've been found years ago, but just "hidden" now.

Of course, if you try harder, you'll get the infamous and dehumanising(!) "you are a robot" CAPTCHA-hellban. I once triggered that at work while searching for solutions to an error message, and was so infuriated that I made an obscene gesture at the screen and shouted "fuck you Google!", accidentally disturbing my coworkers (who then sympathised after I explained.)


Google got where it was by being the best at finding what you wanted. I remember those days.

Google has a hard time getting me what I want these days, and the sites I do find do things to get found that make me like the content a lot less (that's you, inane story on top of every recipe, required to get ranked).

Oh, is that why every recipe on the internet these days is prefixed by five paragraphs of waffling and photos taken from slightly different angles? Thanks, that makes sense, but somehow it never occurred to me that it was SEO. (It’s also reminded me that I’ve been meaning to order some cookbooks.)
There's a Chrome extension to fix that: https://github.com/sean-public/RecipeFilter

> This Chrome browser extension helps cut through to the chase when browsing food blogs. It is born out of my frustration in having to scroll through a prolix life story before getting to the recipe card that I really want to check out.

I find this. It has a bias towards large commercial news/ecommerce sites/daily fresh content. Great for the majority of browsers, but for programming and other hobbies of mine the real content I want is in small niche blogs/forums that don’t get up-ranked like they used to, so they drop off the index.
Consider, though, that a lot of those big sites are there because of various tricks to push themselves up in ranking because they have money that the small, niche sites don’t.

The little guys don’t really have a chance, unless you have a search engine specifically biased toward them.

The only “trick” here is that the author of the website has a fake name. Otherwise, having good content that people link to is not a trick.
No, there was more to it than that. I spent some time, last year, looking into those pay-for-rating VPN "review" sites. And Google displayed some very odd behavior in searches involving TheBestVPN.com and VPN services that had paid it for top rankings.

I'm no SEO guru, but I suspect that some of those VPN services created numerous clones, which all linked to TheBestVPN.com, and so improved its ranking. For example, ExpressVPN had at least 128 clones. Such as expressvpn..., buy-express-vpn-..., get-xpress-vpn...., and xpress-vpn.... I used myip.ms to get hosting information, and they were linked. Also, I bought subscriptions from a few of them, and they all provided working ExpressVPN apps, with the right certificates. And I found no evidence of affiliate codes in the traffic.

Their A/B test told them to do it, without wondering if they should do it

Basically their engagement numbers were better for a larger amount of people by making search engines counterintuitive for early adopters.

We personally need a good robotic search engine that indexes like a robot. Everyone else needs a semi-sentient thing that makes many assumptions about what they want to see.

> Basically their engagement numbers were better for a larger amount of people by making search engines counterintuitive for early adopters.

Which also makes sense ... if you present the "right" result immediately, the user visits one site and has completed whatever he sought to do. If you make him click through 10 pages, he has way more chances to see an interesting ad.

Good points, although in Google’s case the first several results are ads, and their main users can’t differentiate and don’t care even if they could, followed by AMP pages by the most engaged webmasters optimizing for relevancy

That user wants fingerprint based ads and recent articles

Google is optimized for that

We are the only ones that want a “search engine”, a service distinctly good at indexing the known universe, instead of merely presenting the paid and compliant universe

It seems like DDG is getting lots better :)
Lately DDG has started to ignore parts of my query just like Google do, or even worse.

I still use DDG as I find it generally less annoying but I really don't get why they too had to start behaving like the pre-Google search engines.

I've been dabbling with DuckDuckGo lately for this reason, whenever Google fails to find what I'm looking for. It found some C++ advice that Google failed miserably with. (Failing miserably on non-trendy programming topics is becoming an increasing issue.) Also, news overrides history all the time with Google. I hope you don't want to read up on Victor Hugo and his motivations for writing a certain book...because all you'll get is recent articles about Notre Dame burning.
I f hate that about top-ranking recipe pages... I don't need the story or history... I need oven temp and time and most of the ingredients. That's bloody it!!
Who’s passed Google in your opinion? As far as my experience is concerned, Google is still the best at finding what I want. If they’re still #1 they’re still holding up their end of the bargain.
No one.

Nobody has gotten better than Google, but Google has gotten much worse (and shaped the Internet for the worse with it).

It is a de facto search monopoly and without competition it rots. (or degrades to a symbiotic money harvesting machine between searcher and searched)

Google's strong preference for newer content is also kind of a middle finger to content creators. I have written many, many non-fiction articles over the years, and a large portion have been subsequently slurped up by these low-effort lazy-rewrite shops that just change a little bit of phrasing and call it their own. Google prioritizes these borderline-plagiarized, unsourced articles over mine just because the newer ones are newer.

Meanwhile my original (with the same basic information [which I researched personally rather than stole {not to mention I list my sources}]) languishes on page 4 of the Google search results. It grinds my gears on occasion.

I'm really sorry, this is what copyright should be for, but I'm sure it's whack-a-mole and a load of money :(

FWIW I love your content.

What makes no sense to me about this blocking scenario is that the pages being searched for are presumably non-commercial ones that no one else is searching for. In other words, they are in low demand.

It follows that a monopoly search engine would have little reason to block "robots" from copying these pages, maybe to appear on some mythical competing search engine; almost no one is searching for them. The results pages would have dubious value in terms of attracting advertisers. They would not be seen by enough eyeballs.

With all the financial and technical resources it now has at its disposal as a result of selling advertising, this search engine still cannot accommodate the user who intently scans through page after page of results, looking for the needle in the haystack. Instead it prides itself on "knowing what people are searching for", i.e. what they have searched in the past, thus being able to offer fast, "intuitive" responses.

It may be that the search engine was designed and is optimised to prioritize repeat queries, i.e., searches for pages that are sought by numerous people. It may also be true that it has been configured to "limit" the resources it will devote to searches for pages that few people are seeking. Perhaps through CAPTCHAs and/or temporary IP bans.

Practically speaking, it could be that there are no significant advertising sales to be made on the results pages for queries that are being submitted by only one or a very small number of users.

This is all pure speculation of course.

In other words, true democracy kills any chance for individuality.
Can you elaborate?