by JimDabell 362 days ago
This actually blocks a lot more than just AI crawlers. You shouldn’t use this without reviewing it in detail so that you understand what you are actually blocking.

For instance, it includes ChatGPT-User. This is not a crawler; it is the user-agent used when a ChatGPT user pastes in a link and asks ChatGPT about the contents of the page.

One of the entries is facebookexternalhit. When you share a link on Facebook, Threads, WhatsApp, etc., this is the user-agent Meta uses to fetch the OpenGraph metadata to display things like the title and thumbnail.

Skimming through the list, I see a bunch of things like this. Not every non-browser fetch is an AI crawler!
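
To make the review step concrete, here is a rough sketch of sorting a blocklist by what each user-agent is actually for before you deploy it. The category table is hand-written and purely illustrative: the ChatGPT-User and facebookexternalhit entries follow the descriptions above, GPTBot is added as an example of a real training crawler, and SomeNewBot is a made-up name standing in for anything you have not looked up yet. The function names are hypothetical too.

```python
# Hand-assigned categories, for illustration only. Extend this table
# yourself after checking each vendor's documentation.
KNOWN_AGENTS = {
    "gptbot": "ai-crawler",                  # bulk crawler (example added here)
    "chatgpt-user": "user-triggered",        # fetch on behalf of one user
    "facebookexternalhit": "link-preview",   # OpenGraph metadata fetch
}


def review_blocklist(entries):
    """Group blocklist entries by purpose.

    Anything missing from KNOWN_AGENTS lands in "unreviewed" so it gets
    a manual look instead of being blocked blindly.
    """
    groups = {}
    for entry in entries:
        # Compare case-insensitively, as user-agent tokens usually are.
        category = KNOWN_AGENTS.get(entry.casefold(), "unreviewed")
        groups.setdefault(category, []).append(entry)
    return groups


if __name__ == "__main__":
    blocklist = ["GPTBot", "ChatGPT-User", "facebookexternalhit", "SomeNewBot"]
    for category, agents in sorted(review_blocklist(blocklist).items()):
        print(f"{category}: {', '.join(agents)}")
```

Only the entries that end up under "ai-crawler" match what the list claims to block. The rest are a decision you should make deliberately.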