Hacker News
by tbrowbdidnso 3391 days ago
I'm no wizard, but in writing blog articles I've found ways to fool Google into believing me.

The problem with their algorithms is that all the statistics in the world can't help you when you're listening to a guy telling the truth vs. an equally good liar.

I can tell you what Google doesn't have: a strong AI. It thinks it knows "facts", but these are merely patterns, and patterns can be gamed.

Because Google still lacks a strong, truly thinking AI, they rely extremely heavily on statistical models to rate content.

So how do you cheat Google search?

Google's systems attempt to figure out the topic of your writing, its style, and its quality. Is it scientific? An opinion piece? News? Is it a technical topic? A playful one? Fiction or nonfiction?

The quality classifiers are much easier to game than topic and style analysis. They look at things like the reading level of the text, but also the number of rare nouns, technical words, and typos; readability in terms of font and formatting; and trust signals for your domain and possibly for the company and people they determine to be linked with it.
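To make the kind of surface signals described above concrete, here is a toy sketch in Python. It is purely illustrative and is not how Google computes anything: the function name `quality_features`, the feature names, and the tiny `known_words` vocabulary are all made up for this example. It uses words per sentence and average word length as crude stand-ins for reading level, and the share of words outside a known vocabulary as a stand-in for typos or rare words.

```python
import re

def quality_features(text, known_words):
    """Compute a few crude surface features of a text (hypothetical names)."""
    # Split on sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased alphabetic tokens (apostrophes allowed).
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        return {"words_per_sentence": 0.0, "avg_word_length": 0.0, "unknown_ratio": 0.0}
    # Words outside the vocabulary: could be typos, rare nouns, or jargon.
    unknown = [w for w in words if w not in known_words]
    return {
        "words_per_sentence": len(words) / len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "unknown_ratio": len(unknown) / len(words),
    }

# Assumed toy vocabulary; a real system would need a far larger one.
known_words = {"the", "cat", "sat", "dog", "ran", "fast"}
features = quality_features("The cat sat. The dgo ran fast.", known_words)
print(features)
```

Features this shallow are easy to hit on purpose, which is the point being made above: you can tune a text to score well on them without the content being any more truthful.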

I also have a feeling Google uses sneakier signals as well. These include your DNS registrar and the phone number, email, and address listed on the site; who you host with and what technologies you're using; your mail servers and how trustworthy they are; and geolocation and visitor traffic info as soon as you put analytics on the site (or use AMP).

Basically when Google says they have tons of signals, they do. They have a dataset that amounts to every site on the internet for the past 15 years, and they regularly run automated and manual "theory provers" much like quants do with historical stock market data. They find new signals constantly, and run tests to see if their new algorithms are better.

You know how sometimes Google randomly takes a bit longer to load search results? My tinfoil theory: they'll occasionally guinea-pig you on prototype search results to see if they're better. I noticed their response time getting really bad a couple of months before the public rollout of the new AI-powered search, for example.

So gaming Google? Do exactly what a large, legitimate, no-nonsense company would do, from where you host and what you host with to who you link to. Bonus points if you have significant real-looking mail and other traffic from your domain. Extra bonus points if you actually sell something real as cover and do it for at least a few years.

Once you've done enough to convince Google you're a big important thing IRL... write a ton of really subversive bullshit. Make it sound as real as possible; hell, make 90% of it real, just with a single unverifiable fact. Keep pumping this shit out and make sure your garbage is never fake enough to get called out on. Or just make the fake part so hard to verify that nobody will waste the time, kinda like half the science world does when publishing papers.

3 comments

Gaming Google's image search is a popular activity on some subreddits. Basically, people are told to upvote a picture of something with a misleading title, so that if you search for images of "gaming console", a picture of a potato appears. It is very clear that their "AI" is not that clever yet.
> The problem with their algorithms is that all the statistics in the world can't help you when you're listening to a guy telling the truth vs. an equally good liar.

While that's true for sites in isolation, Google published a paper a few years ago[0] that describes how you could estimate the trustworthiness of a website. The basic idea is you assign a trustworthiness score to each website. Then, you determine how likely a fact is to be true based on the trustworthiness of the sites that state that fact. You can then recalculate the trustworthiness of each site based on whether it agreed with the fact or not.

[0] https://arxiv.org/pdf/1502.03519v1.pdf
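A minimal Python sketch of the loop described above, under loud assumptions: this is a toy fixed-point iteration, not the model from the linked paper, which is considerably more elaborate. The function name `estimate_trust`, the `.example` sites, and the claims are all invented for illustration. Each round scores every candidate value by the total trust of the sites asserting it, then resets each site's trust to the average belief in what it asserts.

```python
from collections import defaultdict

def estimate_trust(claims, rounds=20, prior=0.5):
    """claims maps site -> {fact_key: asserted_value}.

    Returns (trust per site, belief per (fact_key, value) pair).
    """
    trust = {site: prior for site in claims}
    belief = {}
    for _ in range(rounds):
        # Step 1: weight each asserted value by the trust of its sources.
        votes = defaultdict(lambda: defaultdict(float))
        for site, facts in claims.items():
            for key, value in facts.items():
                votes[key][value] += trust[site]
        belief = {}
        for key, by_value in votes.items():
            total = sum(by_value.values())
            for value, weight in by_value.items():
                belief[(key, value)] = weight / total if total else 0.0
        # Step 2: a site's trust is the mean belief in its own assertions.
        trust = {
            site: (sum(belief[(k, v)] for k, v in facts.items()) / len(facts)
                   if facts else prior)
            for site, facts in claims.items()
        }
    return trust, belief

# Made-up sites and claims, purely for illustration.
claims = {
    "a.example": {"capital_of_france": "Paris", "boiling_point_c": "100"},
    "b.example": {"capital_of_france": "Paris", "boiling_point_c": "100"},
    "c.example": {"capital_of_france": "Paris", "boiling_point_c": "90"},
    "d.example": {"capital_of_france": "Lyon", "boiling_point_c": "90"},
}
trust, belief = estimate_trust(claims)
for site in sorted(trust, key=trust.get, reverse=True):
    print(site, round(trust[site], 3))
```

Note that the boiling-point question starts as an even 2-2 split. It only tips toward "100" because the sites asserting it also sided with the majority on the other fact, which is the cross-site effect the comment is pointing at.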

What we seem to be running into is that any strategy based on trusting some sites and not others breaks when very large groups of sites hold opposite opinions. Depending on your starting set, you will end up with wildly diverging trust scores.
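A toy Python sketch of that failure mode, with everything in it assumed for illustration (the `propagate` function, the `.example` sites, the two questions, and the seed values): two equally sized camps disagree on every question, one site per camp is pinned to a seed trust value, and the remaining sites get their trust from how much they agree with the currently trusted sites.

```python
def propagate(claims, seed_trust, rounds=20, default=0.5):
    """Spread trust from pinned seed sites to the rest (hypothetical sketch)."""
    trust = {site: seed_trust.get(site, default) for site in claims}
    for _ in range(rounds):
        # Total trust behind each (question, answer) pair.
        weight = {}
        for site, facts in claims.items():
            for key, value in facts.items():
                weight[(key, value)] = weight.get((key, value), 0.0) + trust[site]
        # Total trust behind each question, used to normalise.
        totals = {}
        for (key, _value), w in weight.items():
            totals[key] = totals.get(key, 0.0) + w
        belief = {kv: (w / totals[kv[0]] if totals[kv[0]] else 0.0)
                  for kv, w in weight.items()}
        for site, facts in claims.items():
            if site in seed_trust or not facts:
                continue  # seed sites stay pinned at their starting value
            trust[site] = sum(belief[(k, v)] for k, v in facts.items()) / len(facts)
    return trust

# Two camps that contradict each other on every question.
camp_a = {"q1": "yes", "q2": "yes"}
camp_b = {"q1": "no", "q2": "no"}
claims = {
    "a1.example": camp_a, "a2.example": camp_a, "a3.example": camp_a,
    "b1.example": camp_b, "b2.example": camp_b, "b3.example": camp_b,
}

# Same data, opposite starting sets.
trust_seeded_a = propagate(claims, {"a1.example": 0.9, "b1.example": 0.1})
trust_seeded_b = propagate(claims, {"a1.example": 0.1, "b1.example": 0.9})
print(round(trust_seeded_a["a2.example"], 3), round(trust_seeded_b["a2.example"], 3))
```

Nothing in the claims themselves breaks the symmetry, so in this sketch the unseeded sites simply inherit the verdict of whichever seed set you started from.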
> I can tell you what Google doesn't have: a strong AI

Did anybody think the contrary?

If anyone had one it would be Google :)
Fact is, nobody is even near having one ;)
Fact is, no one wants a strong AI; a Skynet leading to a Matrix future is not that desirable. On the other hand, soft AIs are everywhere and are replacing humans in many jobs.
> Fact is, no one wants a strong AI; a Skynet leading to a Matrix future is not that desirable.

I want a strong AI, because I'm more of a fan of the depictions along the lines of Iain Banks' post-scarcity Culture Minds.