Hacker News
by tyingq 1786 days ago
Canonical URLs help you flag your own intentionally duplicated content. But the rel="canonical" link tag goes on the duplicate page itself, so it doesn't help against scrapers, who just strip it out.
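
To make that concrete, here is a rough sketch of how a crawler might read the canonical hint off a page, using only Python's standard html.parser. The names CanonicalFinder and find_canonical and the example.com URLs are made up for illustration; this is not how any particular search engine does it.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens, so split before checking.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the page, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical pages: your own duplicate keeps the tag, the scraper's copy does not.
duplicate_page = (
    '<html><head><link rel="canonical" href="https://example.com/original">'
    "</head><body>copy</body></html>"
)
scraped_page = "<html><head></head><body>copy</body></html>"

print(find_canonical(duplicate_page))
print(find_canonical(scraped_page))
```

The duplicate you control still points back at the original, while the scraped copy carries no hint at all, which is the whole problem.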

But I thought it was useful for Google, which could find two cached copies with the same content, one from 2018 and one from 2020, both saying "this is canonical". At that point the 2018 version is treated as the real one and the other is rejected.
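
A minimal sketch of that "oldest copy wins" idea, assuming the crawler stores a first-seen date for every copy. The Copy record, its fields, and pick_original are hypothetical names, and nothing here is claimed to match what Google actually does.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Copy:
    url: str
    first_seen: date        # when the crawler first cached this copy (assumed to be tracked)
    claims_canonical: bool  # whether the page declares itself canonical


def pick_original(copies):
    """Prefer copies that claim to be canonical, then take the earliest one seen."""
    if not copies:
        return None
    claimants = [c for c in copies if c.claims_canonical]
    # Fall back to every copy when no page makes a canonical claim.
    candidates = claimants or copies
    return min(candidates, key=lambda c: c.first_seen)


# Hypothetical data: two copies of the same text, both claiming to be canonical.
copies = [
    Copy("https://scraper.example/post", date(2020, 5, 1), True),
    Copy("https://author.example/post", date(2018, 3, 12), True),
]

print(pick_original(copies).url)
```

The weak point is first_seen: it only works if the crawler happened to reach the original before the scraped copy.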

Then again, you could just do it with publication dates ...

I don't know why, but Google seems unable to figure out (or just doesn't care) "who published it first". I've seen it get confused many times.