by senko 1617 days ago
From https://developers.google.com/search/docs/advanced/robots/in...

> [...] Googlebot and other respectable web crawlers obey the instructions in a robots.txt file [...]

If you're saying this is a lie, please provide sources.


On an eCommerce site I'm responsible for, I changed some links from GET to POST. "BingPreview" continued hitting those links with GET requests, polluting my logs with hundreds of "method not allowed" entries. So I blocked that UA from those links; nothing changed. I banned the bot altogether, and it kept hitting my site. This went on for well over a year.
I believe BingPreview is acting like a regular user, so it behaves like a user rather than a robot.
What does that mean exactly? An actual user can't be involved because the links that trigger a GET simply aren't there anymore. Therefore I assume it's a bot hitting faulty links it finds in its cache.
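For anyone following along, here is a rough sketch of the two kinds of "blocking" being discussed, since they behave very differently. The robots.txt content, the /cart/ path, and the shop.example host are made up for illustration; the thread doesn't say how the block was actually configured. The first function shows what a crawler that honors robots.txt would conclude. The second shows server-side enforcement, which applies whether or not the client reads robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the /cart/ path is an assumption, not from the thread.
ROBOTS_TXT = """\
User-agent: BingPreview
Disallow: /cart/

User-agent: *
Disallow:
"""


def allowed_by_robots(user_agent: str, url: str) -> bool:
    """What a robots.txt-compliant crawler would decide for this URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def response_status(method: str, user_agent: str) -> int:
    """Server-side check for a hypothetical POST-only endpoint."""
    if "bingpreview" in user_agent.lower():
        return 403  # banned UA: refused regardless of robots.txt
    if method != "POST":
        return 405  # the "method not allowed" entries from the logs
    return 200


if __name__ == "__main__":
    print(allowed_by_robots("BingPreview", "https://shop.example/cart/add"))
    print(response_status("GET", "Mozilla/5.0 BingPreview/1.0b"))
```

Note that robots.txt is only advisory: it can't stop a client that ignores it, and neither approach stops the requests from showing up in the logs. A 403 or 405 just changes what gets logged.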