by Jiocus
672 days ago
That's why it's possible to have a default deny rule in robots.txt:

  User-agent: *
  Disallow: /

And possibly allow-list the ones you accept. This probably won't change the fact that you may allow a vendor at one point in time, only to realise they changed their crawling use case and have been scraping data for AI training for the past 6 months (before they go public about it).

It can be argued that if you are a server operator, you always know which User-agents are making requests to your resources.
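
To make the default-deny plus allow-list idea concrete, here is a rough sketch that checks such a policy with Python's standard-library robots.txt parser. The crawler names "FriendlyBot" and "UnknownBot" and the example.com URL are made up for illustration, and a real crawler may interpret the file differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: deny every crawler by default, then allow-list one.
# "FriendlyBot" is an invented name, not a real crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: FriendlyBot
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if this policy lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts the robots.txt content as an iterable of lines.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("FriendlyBot", "UnknownBot"):
        print(agent, is_allowed(agent, "https://example.com/some/page"))
```

This only tells you what the policy says. It does nothing about a vendor that keeps the same User-agent but changes what it does with the data.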