by xg15 1611 days ago
> Many site owners find that the titles they carefully craft almost all get rewritten.

Yeah, I'm with Google on this one. I don't see many reasons why a site owner would spend extraordinary amounts of time to "carefully craft" page titles other than SEO and optimizing for clickbaitness. As a user, I'm fine with Google counteracting this.

16 comments

I think that is the worst reason for them to rewrite titles. If they left the title as-is, then I would be able to see in the search results that it was a spammy site and ignore it. Instead Google is helping to launder their SEO and present it as a more legitimate site. If Google thinks a site is gaming their algorithms they should de-prioritize it, not rewrite it.
I think you've got this wrong - they should heavily editorialize the titles.

Honest titles of search results:

* Five pages of flowery text and images before two lines of instruction on how to boil rice

* A bunch of tantalizing pictures of exactly what you're looking for but zero further information about it

* Product reviews machine-generated from public review sources with affiliate links. Top-rated product has the best affiliate revenue.

* You won't care about this solution to a problem you don't have.

etc...

Ah, rice. The quintessential Asian grain, now consumed by billions around the world. When I was a child, my mixed-race family used to eat rice every day! Even today, the subtle aroma of rice wafting up from the kitchen brings a sense of nostalgia. It's a sure sign that dinner is approaching, my favorite meal of the day...

[5 pages later]

1. Put rice and water in rice cooker

2. Press "start"

Isn't Google one major reason that recipe sites act like this? They've long favoured sites with a lot of textual content (which authors then break up with images) and also penalised sites that people tend to reverse out of quickly? A long story fits that because the majority of people need to read down for the content rather than get their instant answer and immediately retreat.

I find it annoying too, but it often feels like people ridicule the authors when they wouldn't get any traffic if it weren't for that approach. I don't think I've ever searched for a recipe and come across a barebones Gantt-chart-style engineer-thinking recipe plan.

It's not Google per se but a practical impossibility: it has to rank somehow, hopefully as a human knowing the answer would. They could theoretically hire humans to do it but they won't because it's obviously impossible due to how vast the dataset is, so they use software. Software is still far from human level reasoning so they use metrics. Metrics can and will be discovered and gamed, regardless of what kind they are.
As an AdSense user, if you don't use the maximum allowable number of ads (regardless of content length), Google literally emails you to suggest you add more ads. Their documentation encourages you to maintain a reasonable ratio of ads to content at risk of being shut down, which pushes out page length. They push for unique content (so writers differentiate with personal stories), they measure time on page (longer details, pictures), etc.
> I don't think I've ever searched for a recipe and come across a barebones Gantt-chart-style engineer-thinking recipe plan.

https://clovegarden.com/recipes/index.html

Sorry, I might not have been clear enough. I know they exist. I'm saying that I've never searched for a recipe for something and a leading result has been in that sort of format. Google has created the environment in which the maligned 'epic story and photo album finished by actual recipe' formula wins through, yet the recipe creators get the ridicule.
Whoa now, shouldn't step 1 be broken up into 2 steps, each with its own heading and a paragraph explaining how to do that?
What kind of rice? Do you rinse the rice first? How much rice?

How much water? Do you salt the water?

Reminds me of Plain Old Recipe, a website that strips out fluff from big recipe websites. You provide a link to a recipe, it makes it to the point. I thought the site had closed but it's apparently still live!

https://plainoldrecipe.com/ https://news.ycombinator.com/item?id=23648864 (Thank you HN :))

and "rice cooker" is an affiliate link
Let’s also not forget the 55 auto-playing video ads that I need to vault over to get to Step 1. Each one determined to hijack my mouse as I scroll/hurry past and cause a click! It’s like the world’s least fun platformer game.
You forgot the part where there's a pseudo-recipe after the story that catches your eye but doesn't have any measured amounts, and then the actual recipe later.
I got instantly annoyed by the first few words of this comment, thinking you’d gone off on some tangent about rice… until I saw the last part. Well played!
This sounds like it would make a very entertaining Chrome extension.
Would also be nice if they edited things to actually be true, e.g.

* e-bike with 10 miles of actual range even though they advertise 30 miles

* laptop with 2 hours of battery life at 100% CPU usage even though they advertise 10 hours

* median $450 flight even though they advertise it as $199

> laptop with 2 hours of battery life at 100% CPU usage

Is there any laptop on the market that lives up to this? Even top-specced MBPs I've gotten from work fall down when you actually use the CPU with compilers and VMs.

My simple M1 MBP with 16GB seems to work for almost 2 hours when hammering the CPU. Haven’t timed it actually, but I find it astonishing compared to the Dell mess I’ve had to deal with before.
Oh just an example. Hammer it at 100% CPU usage and report battery life based on that.

Or a (min,max) based on idle and 100% CPU.

You're never going to guarantee some kind of range on an e-bike. What's the temperature of the battery? Is it mostly uphill or down hill? How much are you going to brake?

And advertising laptop battery life based on the CPU getting pegged to 100% gives meaningless information, as it's rare for people to actually have their device running at 100% load anyways.

> You're never going to guarantee some kind of range on an e-bike. What's the temperature of the battery? Is it mostly uphill or down hill? How much are you going to brake?

Yeah but testing the e-bike on a track and telling the public it has 30 miles of range based on that is disingenuous.

Instead, go to a city with an average amount of hills, stop lights, and cold weather and give it a go, and tell that number to the public. If it beats that in their actual city, they'll only be pleasantly surprised. Right now you strand a shitton of people because they think they have 30 miles.

That depends on Google being both honest and accurate. Perhaps they have been so far, but my concern would be that a re-written title would cause quality content to get passed over by many viewers as undesirable/irrelevant because some algorithm misunderstood/misinterpreted what it was looking at, or because Google wanted to subtly discourage people from content that competes or disagrees with whatever Google is attempting to promote.

In a better world, algorithms would be perfect, there would be a lot of healthy competition in search engines, and Google would be incentivized to provide users with the best possible results. In our current world, Google's algorithm can't identify obvious spam well enough to keep it out of their results, and there are no major search engines that haven't been lifting results from Google directly or indirectly and repackaging them as their own, so Google has no pressure to keep their results accurate and free of spam, or to do anything but promote whatever is in their own best interests.

Imagine if your CLI tools did this.
"Gaming their algorithm" sounds like a fancy way of saying SEO. If Google can produce for me a more accurate (or concise) title, it should only help me find what I'm looking for.

Forcing folks to trudge through inaccurate titles – or hoping people know the tells of a "spammy site" title – does not seem a better alternative.

> "Gaming their algorithm" sounds like a fancy way of saying SEO

It's quite the opposite, "Search Engine Optimization" is the fancy euphemism for gaming the algorithm.

My favorite is when the title sounds like what you’re looking for only to discover it’s a page full of ads and keywords. The original title doesn’t even match.

That causes me to lose faith in Google; it's not a better experience.

If that actually happens, I'm surprised the article doesn't cover it. I've never experienced that.
I’ve found it most on the 2nd or 3rd page when googling specific but not common error messages.
I think what HN and the SWE community at large has just missed about Google over the last 10 years is that the product is being built for the masses. Most people would prefer if you just rewrote the title to what it actually was rather than having to take on the cognitive load of understanding what SEO even is.
AMEN to that
+1
>I don't see many reasons why a site owner would spend extraordinary amounts of time to "carefully craft" page titles

Because I want the title to be concise, but still help people explicitly understand what my writing is about? Because I've already spent a lot of time on the content, to then just slap 'Lou's Wednesday Website Update' as a title? Because, historically, a title is an introduction to my writing?

Any of those.

Regarding one of these examples:

  How to Fix a Broken iPhone Screen [Tested by Experts] - Phone Fixer

->

  How to Fix a Broken iPhone Screen - Phone Fixer

Tested by Experts is obviously clickbait; nobody's going to say [Tested by novices].

Same for things like [Updated 2022] - there are tons of websites that superimpose [updated <currentyear>] even if the article content wasn't updated.

If Google believes the site is being disingenuous by writing a click bait headline, then they should punish the site by decreasing their ranking, not reward it by keeping it high and rewriting a more fitting headline.
But if the title is spam, and the content is good (this is a big 'if'), the best solution would be to rewrite the title so that it's useful and keep the page at its original rank, based on the content. Ideally, Google would be able to handle all these different cases and just give me the best search results. Now, we all know that's increasingly less true, but in theory that's how it should work.
But “for 2022” is a guarantee that the content is bad if it hasn’t changed in 2022.

And yet, I don’t see how Google can automate checking this. It’s possible to add a couple of sentences about how you’ve not seen anything to change your mind about last year’s recommendations. That may well be true. Or false. How can Google know? It just sees content that has changed. So it has been updated in 2022.

The bigger issue is brand trust (as a reviewer brand). The NYT bought Wirecutter, I think, because it had established itself as a trustworthy brand. That’s in direct line with the reputation the NYT wants to have as a whole.

I hate how true your second paragraph is. Google should punish sites that change the date without updating the content, but all the SEO spam is just going to automate changing content when it changes the date. And then what does Google do? Figure out how to make an AI that can understand all the indexed content and accurately determine if it's truthful?

That seems fundamentally impossible without defining trusted sources. But then that means that you're trusting that Google's trusted sources are good. And if you do think they're good, then why not just check those sources directly?

The only answer I have is to find your own sources that you trust and go to them first.

> But if the title is spam, and the content is good

Then the content would not need to be spam, to be high ranking.

Not if google just cared about content quality.

So in this scenario, where only quality counts for rankings, all a spammy title shows, is the desire to bypass legitimate rankings.

Thus, it should be downranked.

Again, this was if Google legitimately wanted to rank good content high.

I'm not convinced, in general I don't like this additional layer of "fiddling around" with the original contents.

What about the opposite, the title being great but the contents not really? Shall Google serve its own "improved"/"summarized"/whatever version?

Meh... - this reminds me of the snippets of text extracted by some websites that are sometimes shown directly in Google's results, which in my case were sometimes wrong because they didn't take into account the context of what was written in the original contents.

It should do both.
Wouldn't it be better for the users to penalize the site's ranking instead of hiding the fact that the result is your usual clickbait drivel? Rewriting the titles just hides that the results Google found are low-quality garbage.
Maybe Google does both?
> Tested by Experts is obviously clickbait

If we're going to start filtering all "obvious clickbait" then the search results are going to change fairly dramatically...

> If we're going to start filtering all "obvious clickbait" then the search results are going to change fairly dramatically

Isn’t this the intended effect?

> Isn’t this the intended effect?

I hate clickbait as much as the next user, but using that technique to get users to click appears to have even become part of the core business model of previously prestigious outlets.

Picking on the WaPo for no real reason:

How the Washington Post pulled off the hardest trick in journalism https://www.cjr.org/public_editor/washington-post-fluff-news...

An Open Letter to the Washington Post: Please Stop Doing Clickbait https://thedailybanter.com/2016/05/letter-to-the-washington-...

As a subscriber to several newspapers, it's always interesting to see how different the headlines are between the dead tree editions, and the online versions — even for the same story.

The dead tree headlines are almost always very factual and to the point. I don't think I've ever seen anything close to something like "Here's four awesome tricks to get China to admit to the Tiananmen Square massacre" as a headline in actual print.

The easiest fix for clickbait would be to penalize them for it.
It would be a great feature if they tracked the date when the content actually changed... significantly. I guess that could still be gamed.
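
One rough way to picture "changed significantly" is a similarity score between the old and new text. This is only a sketch, not how Google (or anyone) actually does it: the function name `changed_significantly` and the 0.9 threshold are made up for illustration, and as noted above, a site could still game it by shuffling enough words.

```python
from difflib import SequenceMatcher


def changed_significantly(old_text, new_text, threshold=0.9):
    """Return True if word-level similarity falls below the threshold.

    The threshold is an arbitrary, illustrative value (an assumption).
    """
    old_words = old_text.split()
    new_words = new_text.split()
    # ratio() is 2 * matched_words / total_words, between 0.0 and 1.0.
    similarity = SequenceMatcher(None, old_words, new_words).ratio()
    return similarity < threshold
```

With this, swapping only the year in an otherwise identical article would not count as an update, while a rewritten article would.
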
Not obviously. If true, adds credibility.

"Phone Fixer" sounds more scammy to me, lol

> Tested by Experts is obviously clickbait; nobody's going to say [Tested by novices].

Nobody would write [Tested by novices] into their headline, but leaving out the part in the brackets would leave it open if it was tested by experts or novices. So in this case the removed bit does provide some information.

>>> Because I want the title to be concise, but still help people explicitly understand what my writing is about?

And yet from TFA:

>> In fact, we found that matching your H1 to your title typically dropped the degree of rewriting across the board, often dramatically.

Users don't look much at titles - they end up in the browser tab or somewhere like that. If a title doesn't match the H1 heading it's often to get more stuff in for SEO. OTOH short titles might be useful when they show up in a tab where there is limited space. Maybe they shouldn't lengthen them for that reason.
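
If you want to check your own pages for the title/H1 mismatch TFA mentions, something like the following would do as a first pass. It is a minimal sketch using only the Python standard library; the names `TitleH1Parser` and `title_matches_h1` are my own, and "match" here just means equal after collapsing whitespace and ignoring case, which is an assumption and not whatever comparison Google uses.

```python
from html.parser import HTMLParser


class TitleH1Parser(HTMLParser):
    """Collects the text of <title> and of the first <h1> in a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = ""
        self._current = None   # tag whose text is currently being collected
        self._h1_done = False  # only the first <h1> is kept

    def handle_starttag(self, tag, attrs):
        if tag == "title" or (tag == "h1" and not self._h1_done):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            if tag == "h1":
                self._h1_done = True
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1 += data


def normalize(text):
    # Collapse runs of whitespace and ignore case.
    return " ".join(text.split()).casefold()


def title_matches_h1(html):
    """Return True if the page has a non-empty <title> equal to its first <h1>."""
    parser = TitleH1Parser()
    parser.feed(html)
    parser.close()
    title = normalize(parser.title)
    return bool(title) and title == normalize(parser.h1)
```

A page whose title carries extra SEO text (a site name suffix, "[Tested by Experts]", and so on) that is missing from the H1 would come back as a mismatch.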

Can't say I agree.

Google should be a neutral middle man providing the results as they are found. If they feel the title is not of their version of quality they should rank it lower.

I'd prefer the version of title of several hundred million individuals rather than Google's aggregated version.

They used to 'borrow' DMOZ titles before DMOZ became defunct. At least in that case it's another point of view on top of their own (and the site author)

Google can't be a neutral middleman because everybody is trying to manipulate the search results. If everybody is clickbaiting their page titles, and Google just displays them as is, it makes their product worse.
The solution is not to re-title, the solution is to de-rank clickbait.
Well, nowadays a lot of well-known websites use clickbait regularly, e.g., WSJ and NYTimes. Many times, they are unwilling to summarize the news in the title even when the news itself is not that complicated.
I'm sure they'd change to better headlines to avoid getting downranked.
That's assuming there aren't click bait false-positives based on page title.
You step away from neutral as soon as you introduce "version of quality". There will always be an introduction of bias and judgement calls that need to be made to get useful results, especially because bad actors on the web are part of the geography and aren't going away. Just like the press trying to force a neutral "view from nowhere" leads to confused and problematic journalism that can be exploited by bad actors.

https://pressthink.org/2010/11/the-view-from-nowhere-questio...

Indeed, quality/bias/judgement - I wouldn't argue about it wrt 'going away from neutral', I just meant that if a decision is to be made, either de-value it or show it in the top results, either way don't tinker with the information as it was laid out.
I agree in theory for SEO mills... but it can apparently go a bit overboard!

Concrete personal example:

- Title shown by Google: "Policymaking Beyond Corporate CEOs and Partisan Pressure"

- Original title: "Towards Platform Democracy: Policymaking Beyond Corporate CEOs and Partisan Pressure"

Rather large difference!

More details in another comment: https://news.ycombinator.com/item?id=30087485 , but search term is just "platform democracy" (2nd result)

For the same reason they spend extraordinary amounts of time to "carefully craft" the content of the page? And the images, and the citations, and the links, etc. For the sake of quality.
I think I see where you're coming from, but come to a different conclusion.

If you are, rightly, disappointed about low quality results in SERPs, then why not direct your frustration at Google's search algorithm? But ultimately once the algorithm has decided what to return, I don't want any of it to be tampered with. Maybe there's an argument that once you're using a black box, it might as well be the best black box it can be, but I don't agree.

I wonder whether there is a case for legal action here. Google would not have wasted time developing this rewrite engine unless it had an effect on clicks. Whether that is positive or negative, only they truly know. What if it was found that it was, or wasn't, being applied consistently to the results of their competitors, but not their own sites, for example?

Google isn't doing this for the user, they are doing it so Ads are more clickable than organic search, they want people clicking on Ads. I can guarantee they won't rewrite the clickbait ads written by marketers who are paying for space. The result is ads are more likely to be clicked

100% of the above the fold content is now ads on many search terms, Google is doing everything they can to squeeze more ad clicks, not provide the best information to their users

"How to growth hack your old website after reaching market saturation"
Some titles of the past before they were optimized for clickbaitiness:

Omelas, bye-bye (The Ones Who Walk Away from Omelas)

Things are looking up (Great Expectations)

A crying cop (Flow, my tears, the policeman said)

The one that got away (The Old Man and The Sea)

on edit: I expect someone will point out those are the names of works of literary fiction not webpages, but obviously if we assume that webpages do not deserve the kind of respect we would give a creative work in book form and not change the title because it suits our needs, then we should not spend all our time complaining that the content of the web is just lousy stuff that nobody would care if you changed with an algorithm.

> As a user, I'm fine with Google counteracting this.

The problem there is that "optimizing for clickbaitness" means "making the titles as appealing to click on as possible when they're displayed in search pages". Google deliberately making them less appealing to click on means Google are reducing the effectiveness of organic search results, and that favors adverts instead.

In other words, what you are saying is that you believe it's valid for Google to rewrite website content to make search page adverts more appealing than the actual search results.

That is very hard to justify. If Google wanted to 'punish' sites for being too clickbaity then they should drop that site's position in the search rankings. Ranking it highly but rewriting the title to be something worse (or 'less clickbaity') is a massive abuse of their search market position to favor their ad business.

Especially when the article ends with:

> Want to optimize your titles for increased traffic?

> We built a title optimizer to take advantage of the outsized role titles play in SEO. Free to try.

Definitely SEO gaming.

If this were being done by a person, I might agree with you.

But it's not. It's being done by an algorithm which was carefully crafted to improve someone's chance for getting a promotion. It won't be maintained long term, yet it will continue to punish articles based on wholly arbitrary, biased, and opaque logic.

> If this were being done by a person, I might agree with you. But it's not. It's being done by an algorithm which was carefully crafted [by a person] to improve someone's chance for getting a promotion.

I made a little change there. Algorithms don't just magically appear like leprechauns and unicorns.

Google search is one area of Google in which this big company problem actually doesn't happen that much. Changes in the algorithm are never implemented by fiat: Google employs raters and performs blind experiments to test whether a change to search actually improves user satisfaction before rolling it out to everyone. So at least they must have some data that it increases user satisfaction, both with the metrics of the signal they measure, and with subjective raters' satisfaction.
I don't get this, why are you ok with bots changing your content, even if it's to be displayed on Google SERP?

Why stop at the titles?

I have an idea: let's have bots rewrite the content in a compact tl;dr format and have it be directly displayed on Google SERP. As a user, the fewer actions I take the better, right? You don't even need to leave the SERP.

Why can't I just choose what title I want in my blog to be indexed, and if Google wants to penalize it, so be it?

> let's have bots rewrite the content in a compact tl;dr format and have it be directly displayed on Google SERP. As a user, the fewer actions I take the better, right?

Fuck yeah! I’d pay monthly for a search engine that does this consistently. Google already does this for the articles that are easy to parse, but I’d love to see what newer methods based on language models can do.

Btw this article is talking about the <title> tag which is mostly used for SEO since users don’t see it on the page. I don’t think search engines have ever cared about it all that much.

And every site has different motivations.

How many times do we bicker about titles that make no sense / are deceptive on HN...

The whole situation is a mess.

"I'm fine with Google counteracting this."

The ministry of truth. Google shall own all truth.

They should de-rank clickbait websites, as many of them qualify as webspam.
>As a user, I'm fine with Google counteracting this.

Would you be fine with Google changing the work of all authors? Maybe "The Brothers Karamazov" doesn't get enough clicks and Google decides it needs a better title. Or "A Portrait of the Artist as a Young Man" doesn't quite convey what Google thinks it should...

How is that different?

To be fair, The Karamazov Brothers is arguably a more natural English translation.
It's perfectly cromulent English.
Should Google adjust it then?