by jongjong 749 days ago
Google should change their algorithm to rank websites randomly: every site shows up in search results with equal probability, so long as it exceeds a certain threshold of relevance for the user's keywords. The threshold could vary for different keywords, but it would be made public, along with instructions on how to meet it, so it doesn't have to be a secret and anyone should be able to get their site showing for at least one specific set of keywords. That would make it impossible to game. Maybe they could have 5 slots in a side container for 'Top trending' for those keywords for the current day, week, month or year (the user can choose the granularity). Problem solved.
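A minimal sketch of the idea as described, to make the mechanics concrete. The relevance scores, the `0.7` threshold, and the function name `rank_randomly` are all hypothetical; nothing here reflects how any real search engine scores pages.

```python
import random

def rank_randomly(scores, threshold, rng=None):
    """Return every site whose relevance meets the threshold, in uniformly random order."""
    rng = rng or random.Random()
    # Keep only the sites at or above the (public) threshold.
    qualifying = [site for site, score in scores.items() if score >= threshold]
    # Every ordering of the qualifying sites is equally likely.
    rng.shuffle(qualifying)
    return qualifying

# Hypothetical relevance scores for one keyword set.
scores = {"a.example": 0.91, "b.example": 0.75, "c.example": 0.40, "d.example": 0.82}
results = rank_randomly(scores, threshold=0.7, rng=random.Random(0))
print(results)
```

Under this scheme, an operator who controls k of the n qualifying sites holds the top slot with probability k/n, so the share of exposure scales with the number of qualifying sites someone can put up.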
4 comments

You would game it by creating more websites.
As others have stated below, this does in fact become a cat-and-mouse Sybil attack scenario where the barrier to entry isn't high enough to stop a bad actor from creating many websites. Online identity and reputation have to be tied to more than just an email address.
But it would be difficult to build a lot of websites that all meet the threshold for specific keywords. The thresholds don't have to be particularly low. In fact, it's better if they require a certain amount of work to meet. So maybe only a relatively small number of websites would qualify for a specific niche keyword, but the idea is that, among those, they are ranked randomly. You'd probably have to use AI to figure out site quality in niche areas.

Or Google could go with a lower-risk approach: keep their results as they are with their current algorithm, but randomize only 3 slots out of the top 10 based on this new threshold approach.
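A rough sketch of that hybrid, under stated assumptions: which 3 positions get randomized is not specified above, so this version picks them at random, and it fills them only with threshold-qualifying sites that are not already in the top 10. The function name `hybrid_top10` and the site names are made up for illustration.

```python
import random

def hybrid_top10(current_ranking, qualifying, random_slots=3, rng=None):
    """Keep the existing top 10, but hand a few slots to random qualifying sites."""
    rng = rng or random.Random()
    top = list(current_ranking[:10])
    # Candidates: sites over the threshold that are not already shown.
    pool = [site for site in qualifying if site not in top]
    k = min(random_slots, len(pool), len(top))
    # Assumption: the randomized positions are themselves chosen at random.
    positions = rng.sample(range(len(top)), k)
    picks = rng.sample(pool, k)
    for pos, site in zip(positions, picks):
        top[pos] = site
    return top

current = [f"site{i}.example" for i in range(12)]
qualifying = [f"new{i}.example" for i in range(5)]
page = hybrid_top10(current, qualifying, rng=random.Random(1))
print(page)
```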

Do you remember those autogenerated websites that were just giant lists of all words? Those disappeared many years ago, but if you made search ranking random, they'd come right back.
In 2030: Do you remember those websites that told a story about their grandmother just to introduce a mathematical theorem? We’re so lucky they disappeared like the giant lists of all words, because they were 100% fabricated by Google’s unnatural incentives.

Google has the ability to change the face of the internet in 2-3 years. They can detect the chaff and shut it down, and I wonder whether it’s an anti-competition feature that they require websites to write a thousand words per page.

I asked ChatGPT to tell me how to get away with murder in the style of a recipe blog and it (surprisingly) did a bang-up job: https://chatgpt.com/share/b738b68d-8294-4a2c-87ff-f95a6e2d91...

I did this after simply wanting to know how much powdered sugar to put in whipped cream and getting frustrated at trying to scroll through 3 blogs just to find the ingredient list for something so simple. Eventually I just asked ChatGPT.

I wonder if Google can start running an LLM on websites to judge them on things like that. Hell, looking for a photographer in your area? Have it judge how good the photography is on each website. The possibilities are there but I don’t know if they’ll bother.

Your link doesn't seem to work.
> Do you remember those autogenerated websites

Still many copy/paste sites around. Crawl data, put a skin on top, publish on stolen domain to make it legit, clickfarm away!

I honestly think the problem can't really be solved because of the adversarial relationships involved. But if there were more than one search engine with significant market share, maybe it would be easier to route around the problem.
Why would it be difficult? Just copy and paste content to different domains, and done. And if, for example, Google decides to down-rank sites that have the same content on different domains, well, then you have a nice weapon against your competitors: just copy their site many times and you've got your competitor removed from Google.
It's a game of cat and mouse, and apparently all the “this is easy” people think they're just smarter than everyone out there.
One of my buddies that got into SEO a half decade before I did mentioned the copy and paste rankeroo stuff was real popular back in the days of Infoseek, Altavista, Excite, Lycos and similar.

Google looks for the canonical version of a document and then deduplicates before returning the result set.

You can add &filter=0 to the end of the search URL for a particular query to turn off the duplicate content filters.
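As a small illustration, the parameter can be appended programmatically. This is only a sketch: the helper name `search_url` is made up, and it assumes the standard `https://www.google.com/search?q=...` URL shape; whether `filter=0` still behaves as described is not something this snippet can confirm.

```python
from urllib.parse import urlencode

def search_url(query, show_duplicates=False):
    """Build a search URL, optionally adding filter=0 as described above."""
    params = {"q": query}
    if show_duplicates:
        # Per the comment above, filter=0 turns off the duplicate content filters.
        params["filter"] = "0"
    return "https://www.google.com/search?" + urlencode(params)

url = search_url("powdered sugar whipped cream", show_duplicates=True)
print(url)
```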

An old-school spam technique for some affiliates in the early days of Google was to buy a high-PR link to their affiliate URL, so that something like site.com/?aff=123 would become the default version of the homepage. Branded searches for the merchant would then earn the affiliate commissions until the rankings shifted again.

Well surely the algorithm can detect duplicate content. Also Google should focus beyond content and consider user satisfaction metrics to decide what is above or below the threshold. Maybe AI can help with all these things?
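For what it's worth, catching exact copies is the easy part. A toy sketch, assuming pages count as duplicates when their text matches after lowercasing and collapsing whitespace (the function name `group_duplicates` and the sample pages are made up, and this is not a claim about how Google does it):

```python
import hashlib
from collections import defaultdict

def group_duplicates(pages):
    """Group URLs whose text is identical after case and whitespace normalization."""
    groups = defaultdict(list)
    for url, text in pages.items():
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Only groups with more than one URL are duplicates.
    return [urls for urls in groups.values() if len(urls) > 1]

pages = {
    "original.example/post": "How much powdered sugar goes in whipped cream?",
    "copycat.example/post": "how much  powdered sugar goes in whipped cream?",
    "other.example/post": "A different article entirely.",
}
dupes = group_duplicates(pages)
print(dupes)
```

Note that the hash only says the pages match. It says nothing about which one is the original, which is exactly the opening the "copy your competitor" objection relies on.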
> That would make it impossible to game.

Have you considered that it would also make for a pretty lame user experience?

Hold up, I think he might be onto something for when he discovers how to sort an array in O(1).
The golden times were 8-10 years ago, when you could change the order of keywords in a Google search and get more precise matches. You could find pretty much any obscure thing on the internet.

You could then find that article you remembered reading 6 months ago by adjusting the keywords until it was on the first page.

Now it does not matter at all what you enter in the search box. No matter the inputs, you get one set of results and will never find something specific.

Yeah, and we should give it a cool name. Something that communicates that this is a new kind of search! I am thinking "NewHorizons" or how about "AltaVista"? What do you think?