by rexreed 1857 days ago
Search quality at Google has been decaying over the past decade. Accuracy and quality of search results are compromised to optimize advertising revenue, penalize competitors or neutralize threats, and cater to the various needs of political or regulatory authorities.

Google's search was at its peak in 2008 when advertising hadn't fully compromised search quality. Google is an advertising business that supports its otherwise money losing properties. Why will things change in the future because you can synthesize data from multiple sources only to compromise that quality with the realities of Google's business model?

5 comments

> Search quality at Google has been decaying over the past decade.

Is there any empirical evidence to back this up? If we’re talking anecdote, I swear as soon as google started labeling ads more clearly people complained more about ads. And if google really is getting worse, I would expect that I would get frustrated with DuckDuckGo not getting the job done less often.

I do share your concerns though. Just look at YouTube as an example. You search for something, and half way down the page are completely unrelated videos that you watched before. This is because YouTube just wants you to click, they don’t care about you finding what you were after.

The one example I usually give people is the one that led me to the realisation myself.

Try searching "how valve index works" or "how valve index controllers work". My interpretation of "how it works" is "technical information on how an item operates". Google will interpret this instead as "how well it performs its intended functions" and flood me with both links to purchase the Valve Index as well as endless reviews. Results on Google are not tailored toward retrieval of factual information anymore. They're tailored to ordinary, garden-variety consumers, and obviously designed to sell you a Valve Index.

To this day I still have not found really good information on how the controllers in the Valve Index actually work. All I get are pushes and nudges into getting me to buy something.

Those are good examples! I'll pass them along to debug. I think what's happening is that the wording is ambiguous enough that it's colliding with concepts like "how well does valve index work." If you search for "how does valve index tracking work" then you get results like this, which is more in line with what you're looking for. https://gizmodo.com/this-is-how-valve-s-amazing-lighthouse-t...
While this mic is on, also fix the time ranges for some queries. If you search “best WordPress plugin for exporting data”, Google often gives me 4 to 5 year old links. Current top result is from 2017... where some plugins don’t even exist anymore :)
I think what happened was they pivoted from _document search_ to an interactive oracle app. I would have used “index controller principles” to get the documents describing it, which no longer works. And I think what you want is the document search back.

And these days they throw a lot of machine translated ripoff sites as well as some malvertising dummy type sites. It’s really something.

> I still have not found really good information on how the controllers in the Valve Index actually work.

Isn’t the Occam’s razor explanation here just that that information is not actually available on the web - not that Google is hiding it from you?

Not in this case:

See the first page of results for DDG's search on how valve index works:

https://duckduckgo.com/?t=ffab&q=how+valve+index+works&atb=v...

Compare to Google: https://www.google.com/search?q=how+valve+index+works

On Google I get some Wikipedia extracted information that says:

"The Valve Index Controllers have a joystick, touchpad, two face buttons, a menu button, a trigger, and an array of 87 sensors that allow the controllers to track hand position, finger position, motion, and pressure to create an accurate representation of the user's hand in virtual reality." with a link to Wikipedia: https://en.wikipedia.org/wiki/Valve_Index

The first raw result is: https://www.pocket-lint.com/ar-vr/news/steam/147913-valve-in...

That's the same link as DDG uses as its first result.

The second Google link is a YouTube video (https://www.youtube.com/watch?v=bD8Y9gcPGzs) that has details about "optics and resolution". The second DDG link is about sprinklers (https://www.sprinklersavings.com/blog/how-an-indexing-valve-...).

Google seems a lot better on this query.

Then they should just come out and say it: "no results found".

Not returning results at all seems to be stigmatised these days for every site.

Not being correct is stigmatized in all aspects of society anymore, thanks to ever increasing business leadership mentality invading our culture. Being wrong, failing, etc. is no longer acceptable. You have to provide the appearance of success in the absence of success.

I don't know why we as a culture can't accept that people fail and fail often. A bit more humility would do everyone some good instead of setting constant unrealistic expectations that hampers all aspects of society. It's completely bananas.

I'll give you one: Google image search is so insanely hobbled by the copyright squad (and possibly right to be forgotten, etc) that it's essentially worthless now. Reverse image search used to be a valuable tool. Now it just spits out generic garbage, even for images that clearly have a wide presence on the net. These days if I need to try to hunt for something, I just pull up Yandex and get the results Google used to give five or ten years ago (better even, since there's a bunch of neat added features like object recognition and automatic OCR).
I have an example from just yesterday. I am new(ish) to the rails ecosystem and spotted a `.ruby-version` file in the root of the repository. I didn't know what it was so I googled `.ruby-version`. The results were less than helpful because Google interpreted that as a search for the term `ruby version`. Fine, whatever, I will just fall back on double-quoting the whole thing, like `".ruby-version"`. A couple of years ago this would have worked perfectly - I know, because I've been doing it for years. But Google no longer respects this kind of search query; instead it tries to be too clever by half and ends up being worse than useless.
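To make the complaint above concrete, here is a minimal sketch of why a punctuation-stripping tokenizer loses a query like `.ruby-version`, and what a literal (verbatim) match would do instead. This is an assumption for illustration only, not a description of how Google actually processes queries; the function names `tokenize`, `tokenized_match`, and `verbatim_match` are hypothetical.

```python
import re


def tokenize(text):
    # Assumed tokenizer: lowercase, keep only runs of letters and digits.
    # Punctuation such as "." and "-" is discarded.
    return re.findall(r"[a-z0-9]+", text.lower())


def tokenized_match(query, doc):
    # True if every query token appears somewhere in the document.
    doc_tokens = set(tokenize(doc))
    return all(token in doc_tokens for token in tokenize(query))


def verbatim_match(query, doc):
    # True only if the exact character sequence appears in the document.
    return query.lower() in doc.lower()


generic_page = "Check your ruby version with ruby -v"
specific_page = "The .ruby-version file pins the interpreter for a project"

# After tokenization the query is indistinguishable from "ruby version".
print(tokenize(".ruby-version"))
print(tokenized_match(".ruby-version", generic_page))
print(verbatim_match(".ruby-version", generic_page))
print(verbatim_match(".ruby-version", specific_page))
```

Under this toy model, the generic page matches the tokenized query even though it never mentions the file, while the verbatim match only accepts the page that contains the literal string.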
I miss the days when punctuation marks were significant to google searches. And the days when you could use logical operators in searches, like +, &, and !

I’m glad I learned POSIX and especially Linux when searches were evaluated more literally. It was simple to locate relevant technical pages.

It’s a shame google doesn’t offer legacy search.

I can only give you anecdotal evidence, which is that myself and many others (per social media) are constantly appending "reddit" because the first N results are all e-commerce sites or thinly-veiled promos for them in the form of listicles.

e.g. Search for "camera with wifi" versus "camera with wifi reddit". If you're doing any research, you will find the latter more useful. Now I know some will say many people just want to buy the product and will be satisfied with a direct link to purchase, but the thing is a good search engine will mix in different types of results. What you get here is dozens of virtually identical results with any genuine info - e.g. a recent post on a reputable personal blog or a social media post - completely buried.

Do any other engines do it better? Maybe not. But Google itself certainly used to do it better, if only because it didn't have the majority of the internet trying to game its algo.

At this point, I basically need to know or find an authoritative source first. PCMag still appears to be a good resource, more so than Tom's Hardware and Wirecutter at times (I think). It's sort of the same shit, but they seem to put a little more work into being right. Too many listicles that are "10 best" are really "the first 10 the author saw while searching." When coronavirus started, there's no way anybody writing most of those "review rollups" ordered and tried on any of the masks they assembled into posts. There are fewer and fewer places that seem to be trying things themselves before recommending them.

https://www.pcmag.com/picks/the-best-sony-mirrorless-lenses

Google really needs an authoritative mode that strips out or deduplicates the news cycle and blogosphere. Something that can tell that every post is basically the same thing and turns it into one entry. I want uniqueness and quality. I don't need the same opinion repeated across 10 URLs.

A CTRL+F of the MUM page didn't find the word duplicate once.
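As a rough illustration of the "turn near-identical posts into one entry" idea above, here is a minimal sketch using word shingles and Jaccard similarity. It is only one textbook way to approximate near-duplicate detection, not a claim about what any search engine does; the helper names, the shingle size `k=3`, and the `threshold=0.5` cutoff are all assumptions chosen for the example.

```python
def shingles(text, k=3):
    # Set of overlapping k-word sequences; short texts become one shingle.
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    # Size of the intersection divided by size of the union.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def collapse_near_duplicates(results, threshold=0.5):
    # Keep a result only if it is not too similar to any result already kept.
    kept = []  # list of (result text, shingle set)
    for result in results:
        candidate = shingles(result)
        if all(jaccard(candidate, seen) < threshold for _, seen in kept):
            kept.append((result, candidate))
    return [result for result, _ in kept]


results = [
    "the ten best cameras with wifi for 2021",
    "the ten best cameras with wifi for 2021 ranked",
    "my hands-on notes after a month with one wifi camera",
]
for entry in collapse_near_duplicates(results):
    print(entry)
```

The second listicle shares almost all of its shingles with the first and is folded away, while the hands-on post survives. A real system would need something cheaper than pairwise comparison (MinHash, for instance) to do this at scale.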

The next generation of search engines should have a config where people can customize their algorithm.

I don't feel that google did worse over the years, more like the commerce part of the internet overtaking the information part.

Actually you make a good point, since google has a shopping tab maybe they should show ads over there only and dedicate the "normal" google to general info

Not to mention that most of those listicles are an extremely shallow cross-section of available products. Reddit is far more willing to suggest off-the-wall options like used 5 year old hardware that still performs better than the newest shiny, and uncovers far more options that are slightly off the beaten, consumerist path.
And also, a search engine with a greater bias on UX would be more personalised, so it would show those kinds of results to the people who regularly seek them.
It’s funny: Google Search still does well at what it was intended for, searching intelligence by keyword to gain understanding.

Google doesn’t care to confirm or deny it, but the origin of Google Search is reportedly some CIA/NSA internal program. Imagine there’s a ton of random Soviet documents, and you wanted to know what the codename chikensandwich in Slicebread division might refer to, or which document is referred to the most from other documents regarding the topic. Don’t you think Google Search as you remember it does exactly that?

And this conspiracy theory explains why Search, Maps, Mail, and very few other products built by such a laid-back, disorganized organization work so well, and why only those work well: they are technology dumps from the NSA, and Google is just an elaborate museum shop allowed to capitalize on its heritage.

1: https://qz.com/1145669/googles-true-origin-partly-lies-in-ci...

I came here for a technical discussion of MUM, but your comment just triggered something: my wife and I pay for ad-free YouTube (part of the music bundle). As a paying customer, I just don’t understand why they would annoy me with showing videos that I have already seen. A better UI would be a top level menu option to show history of watched videos (and search within already watched material). Then the default page could remove already seen material.

I am a happy paying customer for GCP, Play Books+Movies, etc., but I think they need to step up the quality of their services for paying YouTube customers.

Thanks for your comment.

Not likely to find empirical evidence of search results quality, but I think there might be for an overall lowering of content quality. It is so much cheaper to mass produce unimpressive content than ever before.
How about image search? In 2008 there weren't product images and shopping campaign ads inline with the rest of the results. Also reverse image search is now being supplanted by Google Lens search, which again serves up products and ads based on what it can tag in your photo.
Huh. Just ublock all those annoyances away. Didn’t know Google search is so ad ridden.
> If we’re talking anecdote, I swear as soon as google started labeling ads more clearly people complained more about ads

When did Google start labeling ads more clearly? See: https://searchengineland.com/figz/wp-content/seloads/2016/07...

Today the labeling consists of the letters "Ad" in black next to the result.

source: https://searchengineland.com/search-ad-labeling-history-goog...

Yes, the number of ads at the top of the page has increased. The colors have blended ads into content. The number of sites shown is reduced. The amount of indexed content available has been reduced.
> Is there any empirical evidence to back this up?

No. This is the same HN post as "Facebook is dying, pretty soon all their users will be gone and they'll collapse".

How much of it is Google getting worse and how much of it is garbage websites hyperoptimizing for SEO? Practically all news websites are chock full of ads. There's tons of filler websites that just copy/paste text from Wikipedia, etc. Of course, Google could do a better job, but it's codependent evolution.
It's Google's prioritization with ads and preferred sites taking priority even over those SEO-optimized sites.

Google would much prefer to be the sole source of your traffic instead of pushing you to other sites. Google's business is advertising. Why would they want to lose that traffic?

Check this article about the Google MUM announcement, which basically says the same thing:

"MUM is part of Google’s long-term shift away from ranked search results and toward the creation of AI algorithms that can answer user questions faster—often without ever clicking a link or leaving Google’s results page. (Think, for example, of the “knowledge panels” that now appear at the top of many search results pages and display an answer from a website so you don’t have to visit the site yourself.) This shift promises to reduce the amount of work it takes to find information through Google. But it’s not clear that this is a problem in need of a solution." [0]

The Google of today is not the Google of 2008. Google in 2008 was a search engine. Today it's an advertising business that would much prefer you not leave Google properties.

[0] https://qz.com/2010802/googles-mum-is-making-search-worse-by...

> This shift promises to reduce the amount of work it takes to find information through Google. But it’s not clear that this is a problem in need of a solution.

Getting people useful information faster is the problem in need of a solution when you're Google. There isn't a point where that problem is solved; organizing the world's information and making it universally accessible and useful is an unbounded goal.

Is this just speculation on your part or do you have a source for this claim?
What speculation? I quoted the above article stating that, and there's more on this topic. There's a reason for those Infocards.
FWIW I upvoted you upthread to try and counter the inexplicable downvotes; to me the points you made are uncontroversial and almost self-evident. (shrug)
Garbage websites hyperoptimizing for SEO have existed since the late 90s. I agree with the GP: the deterioration of search I have seen over the past 5-10 years is specifically a result of their business model:

1. Any remotely commercial search has an entire first page of ads, organic results are pushed way down.

2. Google has made the difference between ads and search results as minimal as possible. I long for the days of the early 00s of big yellow boxes.

3. On many pages the amount of content Google stuffs in at the top before you get to actual search results gets more annoying every year.

Honestly, I wish I had a button that made Google result pages look like they did 15 years ago.

I feel like the main problem I have with Google results is that they never surface anything interesting or old. There are a lot of searches where it returns nothing useful, but if you add "reddit" it becomes useful.

Besides that, they haven't fought SEO enough on image search, since Pinterest took it over for years.

Totally agree with this. If you're searching for something where the keywords happen to conflict with a current event, good luck finding it.
> I wish I had a button that made Google result pages look like they did 15 years ago.

Or a browser extension.

> How much of it is Google getting worse and how much of it is garbage websites hyperoptimizing for SEO?

Those are the same thing. If garbage websites can game their way up the search listing then Google is failing.

This is a simple problem of competition. Google doesn't have any, so they don't need to provide a good product. They can optimize for ad placement and revenue instead of search quality because users perceive that they have no real choice but to use Google. If another search engine manages to get some real market share Google results will get much better again.

Idk, in the sense that I feel google has been doing better at fighting SEO over time. I used to get crap results, but then again I was less experienced and did not use ad blockers.
I would have agreed with you a few weeks ago. I recently switched my browser to the new Edge and I stuck with Microsoft's default Bing search because fuck Google and all that. I had two occasions in two days where Bing's search frustrated me with their results despite many efforts to tweak the query. I switched over to Google and its first result was exactly what I needed. These were cases where the page didn't contain the phrase I was looking for so it had to interpret/translate it to find the correct information and it did a great job.

A few years ago I felt that Bing and Google search were basically on par. Google has definitely upped the ante regarding search in the last couple of years. It may just be that it does more interpretation than you've come to expect so you need to retrain yourself how to query it. There are also occasions where verbatim search is required for technical topics. But Google's search quality has shown real improvements.

> Search quality at Google has been decaying over the past decade.

This one line is echoed again and again on HN and yet in my experience all its competitors still pale in comparison. I hate Google now as much as the next HNer for its evil shenanigans but their search is still superior and if a browser comes with a default like Bing or DDG (like ff on linux mint) the first thing I do is change it back to Google since the results are truly awful otherwise.

But it is worse than it was a decade ago. Across the board. There's more pollution than ever on the internet, and search engines are doing a worse job of separating the diamonds out.
Just look at the number and size of ads. Quite often, the entire first page of results is ads, and you have to scroll down to find the organic results.
Yeah - seems to jibe with my experiences. It's a tough pill to swallow, but BM25 and tf-idf alongside PageRank continue to be superior to dense vector methods for search. Even dense vectors with re-ranking models afterwards don't perform as well. I've been sad to see that models like BERT are becoming more prevalent in search, as they are a significant portion of why Google's search has gotten worse...
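For anyone who hasn't met the lexical scoring mentioned above, here is a minimal sketch of Okapi BM25 in plain Python. It assumes whitespace tokenization and the commonly used defaults `k1=1.5` and `b=0.75`; the function name `bm25_scores` and the toy documents are made up for illustration, and a production ranker would add stemming, an inverted index, and link signals on top.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    # Score every document against the query with Okapi BM25.
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in tokenized) / n_docs

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for doc in tokenized:
        doc_freq.update(set(doc))

    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            length_norm = 1 - b + b * len(doc) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "valve index controller sensors track finger position",
    "buy valve index best price review",
    "sprinkler indexing valve",
]
scores = bm25_scores("controller sensors", docs)
best = max(range(len(docs)), key=lambda i: scores[i])
print(docs[best])
```

Only documents that literally contain the query terms get a nonzero score, which is exactly the "evaluated more literally" behaviour people in this thread say they miss.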
The key here is that transformer-based "search" isn't actually providing links to the sources of information the way search works now, but rather synthesizing information as a result of being trained on the corpus of Internet data.

In this way, Google gets all the value from Internet properties they don't own without having to push any traffic to those sources. So, they have their cake and eat it too. They create a way to regurgitate information from the vast trove of info on the Internet without ever having to share traffic with those sources by moving traffic from their search engines to those sites, like they do now.

They get to sell advertising to those who want to capture eyeballs for search results, without having to share any ad revenue with the content providers that are powering that transformer-based search.

Ain't it grand?

This has been coming for some time now, to be fair.

Now that it's pretty close to actually being here, the grim reality is that anyone who was expecting the status quo to just march on like always is going to get screwed over, and a new wave of successful businesses will adapt to it and thrive.

It's called 'disruption', and it's a bit disappointing to see people here of all places complaining about it.

Sure, I get it, it's google, and if it was some nippy unicorn doing it people would be more enthusiastic, but ML is hard to do right, and having someone who's actually pushing the boundaries of what's possible is, in my opinion, pretty cool.

BERT made a huge contribution, and if this eventually flows out to everyone else to use, that's great news.

...and, if google stops sending traffic to some websites, well, too bad. We'll adapt; so will others.

The ones that can't will disappear.

Disruption is a descriptor, not a moral imperative.
> They get to sell advertising to those who want to capture eyeballs for search results, without having to share any ad revenue with the content providers that are powering that transformer based search. Ain't it grand?

Reminds me of spammers making spun articles.

Source? I am quite intrigued by this anecdote for information retrieval.