Hacker News
by dazc 190 days ago
Not sure how search result pages can be crawled unless they are cached somewhere?

If I'm reading correctly, it's not that your search results would be crawled. It's that if you created a link to www.theirwebsite.com/search/?q=yourspamlinkhere.com, or otherwise submitted that link to Google for crawling, then the Google crawler makes the same search and sees the spam link prominently displayed.
Yikes.

What could Google do to mitigate?

You noindex search pages or anything user-generated. It's really that simple.
Not enough. According to this article (https://www.dr.dk/nyheder/penge/pludselig-dukkede-nyhed-op-d... you probably need to translate), it's enough to link to an authoritative site that accepts a query parameter. Google's AI picks up the query parameter as a fact. The article is about a Danish company probably circumventing sanctions, and how Russian actors manipulate that fact and turn it around via Google AI.
Yeah all pages should have a proper canonical which would solve this too
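For anyone wondering what that could look like, here is a minimal sketch, assuming the canonical form of a page is simply its URL with the query string and fragment dropped. The function name `canonical_link_tag` and the example.com URL are made up for illustration, and whether a canonical alone is enough to stop the problem described above is the commenter's claim, not something this snippet proves.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(requested_url: str) -> str:
    """Build a <link rel="canonical"> tag for the requested URL.

    Assumption: the canonical URL is the requested URL without its
    query string and fragment, so /search/?q=anything maps to /search/.
    """
    parts = urlsplit(requested_url)
    # Keep scheme, host and path; blank out query and fragment.
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # Escape the URL so it is safe inside an HTML attribute.
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


if __name__ == "__main__":
    print(canonical_link_tag("https://www.example.com/search/?q=spam.example"))
```

The tag would go in the page's head, so every variation of the query parameter points back to the same parameter-free URL.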
In this case, all I had to do was let the crawler know not to index the search page. I used the robots noindex meta tag on the search page.
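Roughly what that amounts to, as a sketch rather than the exact setup used here: the `/search` path prefix and the helper names are assumptions. The `robots` meta tag and the `X-Robots-Tag` response header are the two standard ways to send a noindex signal.

```python
# Hypothetical list of path prefixes that should never be indexed.
NOINDEX_PREFIXES = ("/search",)


def robots_meta_tag(path: str) -> str:
    """Return the noindex meta tag for search pages, or "" otherwise."""
    if path.startswith(NOINDEX_PREFIXES):
        return '<meta name="robots" content="noindex">'
    return ""


def robots_headers(path: str) -> dict[str, str]:
    """Same signal as an HTTP response header instead of a meta tag."""
    if path.startswith(NOINDEX_PREFIXES):
        return {"X-Robots-Tag": "noindex"}
    return {}


if __name__ == "__main__":
    print(robots_meta_tag("/search/"))   # tag to place in the page's <head>
    print(robots_headers("/about"))      # no header for normal pages
```

One caveat: the crawler has to be able to fetch the page to see the noindex, so the search path should not also be disallowed in robots.txt.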
I don't know what you mean by cache but you aren't using it correctly...