by mcfedr 1590 days ago
I never understood why Google isn't blocking these crap results. It's making my search experience really bad for a lot of my searches.
6 comments

They used to be superb at detecting duplicated content. They were also extremely good at detecting spam/ham. Nowadays it feels like they don't even care anymore and whatever filters they have are either broken or untrained.
Bring back the panda, I say!

https://en.wikipedia.org/wiki/Google_Panda

It is an arms race.

I wonder how many players there are on the page generation side? The economics of it must be marginal, I guess.

Copycat sites also used to be extremely careful not to look like copycat sites, and not to duplicate content on the same site. I am surely not alone in recalling the old mantra of not duplicating content.

Copycat sites don't seem to care anymore.

I don't believe there is an overwhelming number of them for Google et al. to deal with, as it's often the same names topping the search results, and filters driven by semi-manual user action could remove those.

Which leads to the conclusion: Google don't care about duplicate content any more.

Were they? I remember having to manually block a lot of those copycat Wikipedia/Stack Overflow sites myself back in 2011 or 2012, when users still had the domain-blocklist option. When the feature was removed, they all came back.

Maybe the problem is just that there are more of those now.

Google removed that option without even trying to spin it as a pro-consumer change. The only problems I can think of that it caused Google are clueless users complaining that they can no longer see microsoft.com in their results, and a negative impact on unethical advertisers.
I had a lot of problems with "duplicated content" from sites that published the same content as I did and outranked my site.
Do they do a good job at getting clickthroughs on Google ads on their site? :-/

Does the rate of ad-clicking on the results page increase if most of the "natural" results are crap? :-(

I've noticed a recent trend where the copycat/adware sites are "up-ranked" relative to original content. This would be the expected behavior of a search engine optimizing for clicks and revenue.
Don't be evil
>Dont, be evil

For an Alphabet company, they sure don't know where to put the apostrophe

I think they shut down that app.
Part of the problem might have been that Stack Overflow has been busy shooting themselves in both feet for years.

For a while (maybe around 2012 to 2017) it felt like it was almost the rule that if you found a really useful question on Stack Overflow, it would be marked as low quality.

Eventually I guess those questions were pruned, and that might explain a bit of why the copycat sites rose.

They annoyed me too though, as they often mixed together unrelated questions on the same page and got hits for very specific queries that were unrelated.

They should give the YouTube audio fingerprint team a shot at it.

But seriously, Google doesn't need to do anything besides bringing back the option to hide certain domains from the results forever. Even if they don't analyze what domains people are hiding, it would dramatically improve the usability.
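
For what it's worth, the per-user version of this is not much logic. Here is a rough sketch in Python, assuming the results are just a list of URLs; the function names and the `copycat.example` domain are made up for illustration and have nothing to do with how Google's old feature actually worked:

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the lowercased hostname of a URL, or "" if it has none."""
    return (urlparse(url).hostname or "").lower()


def is_blocked(url: str, blocklist: set[str]) -> bool:
    """True if the URL's host is a blocked domain or a subdomain of one."""
    host = domain_of(url)
    return any(host == d or host.endswith("." + d) for d in blocklist)


def filter_results(urls: list[str], blocklist: set[str]) -> list[str]:
    """Drop every result whose host is on the user's blocklist."""
    return [u for u in urls if not is_blocked(u, blocklist)]


# Hypothetical data: one real site, one blocked copycat, one lookalike
# domain that merely ends with the same letters and must not be blocked.
blocklist = {"copycat.example"}
urls = [
    "https://stackoverflow.com/q/1",
    "https://www.copycat.example/q/1",
    "http://notcopycat.example/x",
]
kept = filter_results(urls, blocklist)
```

Matching on the full host or a dot-prefixed suffix is deliberate: it catches `www.` and other subdomains of a blocked site without also hiding unrelated domains that happen to share an ending.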

Google started adding other quantitative measures for ranking results. That's how some of these crap web sites manage to rank so high.
They probably are. But the clone sites are designed specifically to avoid being blocked by Google.