No, what scales is us making sure our DDoS and bot detection doesn't disrupt crawling by legit search engines that respect robots.txt, don't crawl at ridiculous speeds, and don't do dumb stuff like pretending to be Googlebot. We have teams who work on that. You can read more here: https://blog.cloudflare.com/tag/bots/
But let's suppose someone is building a new cool search engine and our ML stuff is blocking them. Then... contact us/me.
So for my startup to crawl sites I must now adhere to Cloudflare’s Requirements of the Web(TM) or reach out to an individual engineer, who may leave at any moment. Gotcha.
(but Google is allowed because Google was first to market)
Why would you possibly think you can do whatever you want to someone else's site?
Yes, you must adhere to the controls that site administrators put in place, like Cloudflare. You don't get to blast my site with requests just because you want to.
I can't speak for Cloudflare, but crawling speed should be dictated by the site owner via the robots.txt crawl-delay directive. [1] A site owner could also rate-limit unauthenticated requests by client IP, taken from the Cloudflare header, and answer anything over the limit with a 429 Too Many Requests error page.
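To make that concrete, here is a rough Python sketch of both halves. It is not how Cloudflare does it, just an illustration under some assumptions: the robots.txt content, the `ExampleBot` name, the limits, and the helper names (`crawl_delay_for`, `client_ip`, `PerIpRateLimiter`) are all made up, and I'm assuming the "cloudflare header" is `CF-Connecting-IP`, which Cloudflare sets to the visitor's address.

```python
import time
import urllib.robotparser
from collections import defaultdict, deque

# Hypothetical robots.txt a site owner might publish.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def crawl_delay_for(user_agent, robots_lines, default=1.0):
    """Crawler side: seconds to wait between requests, per robots.txt."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_lines)
    delay = parser.crawl_delay(user_agent)  # None if no Crawl-delay applies
    return float(delay) if delay is not None else default


def client_ip(headers, remote_addr):
    """Server side: prefer the Cloudflare-provided visitor IP.

    Assumption: the origin only accepts traffic from Cloudflare, so the
    CF-Connecting-IP header can be trusted. Otherwise fall back to the
    socket address.
    """
    return headers.get("CF-Connecting-IP", remote_addr)


class PerIpRateLimiter:
    """Sliding-window limiter: at most max_requests per window per IP."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed hits

    def status_for(self, ip, now=None):
        """Return the HTTP status to send: 200 if allowed, 429 if over limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return 429  # Too Many Requests
        hits.append(now)
        return 200
```

A polite crawler would call `crawl_delay_for("ExampleBot", ROBOTS_TXT.splitlines())` and sleep that long between fetches, while the site would call `limiter.status_for(client_ip(request_headers, remote_addr))` on each unauthenticated request. A real deployment would also want a `Retry-After` header on the 429 and some eviction of idle IPs, which I've left out to keep this short.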