by dgreensp 311 days ago
Googlebot respects robots.txt. And Google doesn't use the fetched data from users of Chrome to supplement their search index (as a2128 is speculating that Perplexity might do when they fetch pages on the user's behalf).

Yes, but there's no way to say "allow indexing for search, but not for AI use", right?
But there is: https://developers.google.com/search/docs/crawling-indexing/...

There is a user agent for search that you can control in robots.txt.

    user-agent: Googlebot
There is another user agent token that controls use for AI training.

    user-agent: Google-Extended
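
To make the split concrete, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The two user-agent tokens come from the comment above; the `Allow`/`Disallow` rules, the `example.com` URL, and the `allowed` helper are assumptions for illustration, not something taken from Google's docs. It only shows how a robots.txt parser reads such a file, not how Google actually applies it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow search crawling, opt out of AI training use.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Google-Extended
Disallow: /
"""


def allowed(agent: str, url: str = "https://example.com/page") -> bool:
    """Return True if `agent` may fetch `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


print(allowed("Googlebot"))        # True
print(allowed("Google-Extended"))  # False
```

Agents that match neither group (and no `*` group exists) fall through to the parser's default, which is to allow the fetch.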
Wow, I had no idea this page existed, thanks for the reference!