| which again comes back to this being a problem that needs regulatory action, not one that should be solved by a quasi-monopoly forcing it onto everyone except another quasi-monopoly, which can use its monopoly power to avoid it. Regulation should: - require respecting robots.txt and similar - require purpose binding/separation (of the crawler agent, but also of the retrieved data), similar to what GDPR does - require public agent purpose documentation and stable agent identities - disallow obfuscation of who is crawling what - actually be enforced. And sure, making something illegal doesn't prevent anyone from being technically able to do it, but now at least large companies like Google have to decide whether they want to commit a crime, and the more they obfuscate that they are doing it, the more proof there is that it was done in bad faith, i.e. the higher judges can push punitive damages. Combine that with internet gateways like CF trying to provide technical enforcement and you might have a good solution. But one quasi-monopoly trying to force another to "comply" with its money-making scheme (even if it's in the interest of the end user) smells a lot like a winnable case against CF wrt. unfair market practices, monopoly power abuse, etc. |
There is also nothing stopping other CDN/DNS providers from spinning up a marketplace similar to what CF is looking to do now.