by Raed667 336 days ago
Was it ever publicly communicated how 12ft or archive.ph / archive.is work? Or is it something they keep to themselves?

I think (in the case of 12ft) they were just impersonating Googlebot.
That's surprising, because Google publishes the IP ranges for its crawlers and it's fairly simple to block fake crawlers these days (super easy through Cloudflare, for example).
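For anyone curious what the IP-range check looks like in practice, here is a rough sketch. It assumes you have already loaded the published crawler ranges as CIDR strings; the two blocks in `SAMPLE_GOOGLEBOT_RANGES` are illustrative placeholders and the function name `in_published_ranges` is made up for this example, so don't treat either as authoritative.

```python
import ipaddress

# Illustrative CIDR blocks only (an assumption for this sketch).
# In real use, load the current list that Google publishes for its crawlers.
SAMPLE_GOOGLEBOT_RANGES = ["66.249.64.0/27", "2001:4860:4801:10::/64"]


def in_published_ranges(client_ip, cidr_blocks=SAMPLE_GOOGLEBOT_RANGES):
    """Return True if client_ip falls inside any of the given CIDR blocks."""
    addr = ipaddress.ip_address(client_ip)
    for block in cidr_blocks:
        network = ipaddress.ip_network(block)
        # Only compare addresses and networks of the same IP version.
        if addr.version == network.version and addr in network:
            return True
    return False
```

A request that claims a Googlebot user agent but comes from an address outside the list would fail this check, which is presumably the kind of rule a CDN applies for you.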
Doesn't Google also run some "undercover" bots to verify that you don't serve very different versions of your website to users vs. bots?
In my experience, 12ft.io was pretty much useless after a honeymoon period of a few months when it first came out, so I wouldn't be surprised. The Googlebot method used to work with almost everything, but at some point major news orgs caught on in quick succession and I gave up even bothering to try it.
This shouldn't work, due to reverse DNS checks.
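To make the reverse DNS point concrete, here is a minimal sketch of a forward-confirmed reverse DNS check. The helper name `is_verified_googlebot`, the suffix list, and the injectable lookup parameters are assumptions made for this example (the injection is only there so it can be exercised without network access). Note that `socket.gethostbyname_ex` only resolves IPv4, so an IPv6 client would need `socket.getaddrinfo` instead.

```python
import socket

# Hostname suffixes assumed for genuine Google crawlers in this sketch.
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(client_ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a claimed Googlebot IP."""
    # Step 1: reverse lookup (PTR record) of the connecting address.
    try:
        hostname = reverse_lookup(client_ip)[0]
    except OSError:
        return False

    # Step 2: the PTR name must belong to an expected crawler domain.
    if not hostname.lower().rstrip(".").endswith(GOOGLE_CRAWLER_SUFFIXES):
        return False

    # Step 3: forward lookup of that name must map back to the same IP,
    # since whoever controls an IP block can set any PTR record they like.
    try:
        forward_ips = forward_lookup(hostname)[2]
    except OSError:
        return False
    return client_ip in forward_ips
```

A client that only spoofs the User-Agent header fails at step 2, and one that also fakes its PTR record fails at step 3, because it cannot make the real crawler domain resolve to its own address.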
> they were just impersonating Googlebot.

Which is something that shouldn't work. Google used to require sites to show the same thing to Googlebot and normal users; cloaking used to be banned. Were Google still enforcing that rule, these sites would have been removed from its index.