Hacker News
by superasn 1067 days ago
It's so weird that Google still hasn't created the ability to block spam domains.

I don't think it's a technical difficulty but rather a management decision to allow spam as long as they have their adsense ads and it baffles me they just don't care about their end users at all.

I think being a monopoly with most competitors a lap behind can make you this way. But with ChatGPT catching up quickly, Google had better start thinking about its users now: there's a growing sentiment that its quality is terrible nowadays, and its uncaring attitude toward spam domains like Pinterest, and toward spam search results where sites create hundreds of pages with the same content under different headings, just isn't going to cut it anymore.

11 comments

Most of these spam content farms are covered with ads supplied by Google. The incentive isn't necessarily there to remove them.

On top of that the worse the search results, the more likely the user will click an ad rather than an organic result. The Google of old wouldn't have been tempted by that incentive, but that Google is long dead.

Bad search result = more revenue from ads.

Obviously it's a fine balance, they don't want to lose users, but they will have the metrics to (religiously) work from.

As someone who used to run ad campaigns on Google until a couple of years ago, they are doing the same to advertisers. Users will happily click an ad, go back to the result and try another, many many times. Google have systematically made the advertising on search results worse, showing ads more regularly for poorer placements, removing control and visible auditing. It's all so they can extract more revenue from the advertisers.

This is such short term thinking imo but then what do i know. People who are paid millions of dollars are making these decisions at Google so maybe that's how the game is played. I still think treating your customers the way they're doing now never works out in the end.
When you have a metric-driven company, where people's career progression and bonuses are tightly tied to revenue numbers in a database, you incentivise this sort of thinking on an employee level. The visibility of the impact isn't necessarily there at the top.
Not just a metric-driven company, but one that's so large the left hand doesn't know what the right is doing and neither of them could optimize their jobs for the other even if the incentives were for them to do so.
> This is such short term thinking imo but then what do i know.

That's because it's completely inaccurate. It's just a meme propagated by HN and some others, with essentially zero correlation to how decision-making and prioritization actually happen over there.

The idea that clickbait is somehow good for Google's bottom line is absurd on its face, before we even tackle the idea that ads' interests are controlling search ranking.

> The idea that clickbait is somehow good for Google's bottom line is absurd on its face

I guess I am simply unable to see so I will ask: why do you think the idea as outlined in the GP is so illogical?

>The incentive isn't necessarily there to remove them.

There is a reason people are using ChatGPT as a replacement for Google, or appending "Reddit" or "wiki" to their searches.

I'm waiting for a day there will be a capable LLM agent that would be able to crawl hundreds of results, read them all (working around all the obstacles, not spewing "reading content failed"), filter the marketing noise/SEO copypasta bullshit, find the meaningful bits, and present a nice summary.
Current LLMs can go crazy just by seeing a single well-chosen word, see https://www.lesswrong.com/posts/aPeJE8bSo6rAFoLqg/solidgoldm... for an amusing example. I think we need some breakthrough in adversarial training before LLMs can survive the very adversarial environment of SEO.
They actually did have it as an experimental feature for a little while, maybe 10 years ago. [0]

If I recall correctly, it was made available in response to another big wave of criticism directed towards Google about "Content Farms". It has been interesting to see the difference in response between the "Content Farm" debacle and what we are dealing with currently.

[0] https://googleblog.blogspot.com/2011/03/hide-sites-to-find-m...

Pretty sure 4chan is knocked down quite a bit. Sometimes when I search 4chan gif or b, I get a page of unrelated websites.
If anyone wants this capability, the Chrome extension uBlacklist ( https://chrome.google.com/webstore/detail/ublacklist/pncfbmi... ) provides it. I've found it very useful for removing GitHub scraper sites from search results. Whenever you see a garbage result in a Google search, you just click "Block this site" and it's gone forever.
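
For anyone curious what a per-user domain blocklist boils down to, here is a minimal Python sketch of the idea. It is an assumption-laden illustration, not how uBlacklist or Google actually implement it: the function names, the example URLs, and the rule "block the domain and all its subdomains" are all made up for this example.

```python
from urllib.parse import urlsplit


def is_blocked(url, blocked_domains):
    """Return True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked_domains)


def filter_results(urls, blocked_domains):
    """Drop every result whose host matches the user's blocklist."""
    return [u for u in urls if not is_blocked(u, blocked_domains)]


# Hypothetical blocklist and result page, for illustration only.
blocked = {"pinterest.com", "scraper.example"}
results = [
    "https://www.pinterest.com/pin/123",
    "https://github.com/some/repo",
    "https://mirror.scraper.example/some/repo",
    "https://notpinterest.com/page",
]
kept = filter_results(results, blocked)
print(kept)
```

Matching on the parsed hostname with a leading dot means `www.pinterest.com` is dropped while an unrelated host like `notpinterest.com` is kept, which a plain substring check would get wrong.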
> I don't think it's a technical difficulty but rather a management decision to allow spam as long as they have their adsense ads

It is absolutely this, yes. They tell people where to go, and they also profit from the ads they put on the places they tell people to go, and they have no real competition[1]. It has made them one of the most valuable companies on the planet. There is no incentive for them to rank higher quality results over ad-filled pages.

We need to break up big tech.

[1] https://gs.statcounter.com/search-engine-market-share/all/un...

> they just don't care about their end users at all.

They do, a lot actually. We're just not the end users. The advertisers are. They don't need to care about us since they're a monopoly. Use Brave Search, DuckDyckGo or something else to help change that. Get other people to too.

I tried using DuckDyckGo (lol) for a few weeks recently, but its results are just as bad or worse than Google's are usually. It's the same problem of endless worthless listicle noise.
Yeah, I used it for years and it used to be better but it turned pretty trashy, which is why I switched to Brave a couple of months ago and I'm pretty satisfied now. Still glad DDG exists though as an alternative protest vote.
> it baffles me they just don't care about their end users at all.

Just for you, especially for you in fact, I went and dug out this link to an interview with Cory Doctorow where he explains his theory of platform “enshittification”.

https://podtail.com/en/podcast/future-tense-full-program-pod...

I think it will interest you to hear a reasonable theory as to why Google doesn’t care as much about their end users as you might think they should.

Interesting concept. Thanks for sharing!
I'm talking out of my ass here and know nothing about how search actually works in 2023, but I feel like their search problems really are solvable with their scale.

Like you said, a first step would be blocking spam domains.

Another would be a return to how the algorithm used to work, prioritizing results that are frequently linked to elsewhere using the relevant keywords. In the '00s, this was quickly gamed by companies setting up endless blogs to link to their own products, but I feel it could be mitigated by assigning some kind of trustworthiness score to the sites doing the linking. It shouldn't be hard to recognize that a site being recommended a lot on Reddit or some other well established repository of user-generated content is going to be more genuine than "recommendations" from some random blogspam site nobody's heard of.
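
To make the trust-score idea concrete, here is a toy Python sketch. It is only an illustration under loud assumptions: the trust values, the default for unknown sites, the "count each linking site once per target" rule, and all the domain names are invented here, and this is not a claim about how Google's ranking works or ever worked.

```python
from collections import defaultdict

# Assumption: sites with no known reputation get a very small weight.
DEFAULT_TRUST = 0.01


def rank_by_trusted_links(links, trust, keyword):
    """Score each target by the summed trust of distinct sites linking to it.

    links: iterable of (source_site, target_site, anchor_text) tuples.
    trust: dict mapping source_site to a trust weight.
    Only links whose anchor text contains the keyword are counted.
    """
    seen = set()
    scores = defaultdict(float)
    for source, target, anchor in links:
        if keyword.lower() not in anchor.lower():
            continue
        if (source, target) in seen:
            continue  # repeated links from the same site add nothing
        seen.add((source, target))
        scores[target] += trust.get(source, DEFAULT_TRUST)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: one well-known site vs. fifty throwaway blogs.
trust = {"reddit.com": 1.0}
links = [
    ("reddit.com", "goodtool.example", "best markdown editor"),
    ("reddit.com", "goodtool.example", "best markdown editor"),
]
links += [
    (f"blog{i}.example", "spamshop.example", "best markdown editor")
    for i in range(50)
]
ranking = rank_by_trusted_links(links, trust, "markdown editor")
print(ranking[0][0])
```

With these made-up numbers, fifty blogspam links are still outweighed by a single recommendation from a trusted site. The hard part the sketch skips entirely is where the trust values come from and how to keep them from being gamed.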

You’re making a few bad assumptions.

1) Google and you are defining spam differently.

2) Spam is adversarial. They have the ability to block spam domains from last week, but it’s an ongoing investment as the adversaries adapt.

That said I do think Google depends too much on human raters who are not customers. It’s caused a lot of drift between the customer’s expectation of quality and Google’s.

I always figured it's because there isn't a real definition of "spam domains". Google can't come up with a set of rules with the right error rate.
> It's so weird that Google still hasn't created the ability to block spam domains

by who??? Google itself "blocks" or downgrades spam websites all the time. If they open it up to "the people", the same SEO crowd will rush in with scrapers to decimate competition, including any legit websites. So strange to see people jumping into conspiracy theories here without thinking of the consequences for a second.

I am pretty sure OP is referring to the ability for an individual, logged in user, to maintain a list of "don't show me results from these domains" on their Google account.
OK, I didn't even think about that use case because I'm never logged in while searching. More bizarrely, however, with 300 million domains out there and more added every second, you are in for quite a chore! Even more delusional.
How long until ChatGPT injects an advertisement every other paragraph?