I would reconsider this approach. I run web crawlers for fun and profit, and there are several pages that you can assume a website might have, like robots.txt, humans.txt, sitemap.xml, sitemaps/sitemap-index.xml, blog.example.com, blog, etc.
If someone is trying to get to admin.php, sure, ban them. Or if they are not following robots.txt. But sitemaps are not reliable enough sometimes and not all crawlers are meanies.
You don't know that. He could be running a crawler that builds a service that ends up sending you very valuable traffic... Or you could be right... Not really enough info though.
There is plenty of info, namely that the service doesn't exist now, and that we can recognize and allow its crawler if it ever becomes successful. It just doesn't make sense to allow every Tom, Dick and Harry's crawler on the promise of some future benefit. That's like delivering and opening every spam e-mail in case there is a nugget in there.
The idea of saying yes to everything is comically explored by Jim Carrey in the film:
robots.txt is widely ignored. User-agent fields are faked out to make robots look like Firefox on Windows.
Anyone can make a crawler and then have it report as Googlebot. That doesn't even violate the robots.txt; it says, if your name is Googlebot, you're allowed.
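To illustrate the point about names, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content and the crawler name "MyHomemadeCrawler" are made up for the example. The parser matches rules purely on whatever name the client declares, so nothing ties the "Googlebot" rule to Google.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: only "Googlebot" is welcome, everyone else is not.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(user_agent, url="https://example.com/page"):
    """Return whether robots.txt permits this self-declared name to fetch url."""
    return parser.can_fetch(user_agent, url)


# The decision depends only on the name the crawler chooses to report.
honest = allowed("MyHomemadeCrawler")  # falls under "User-agent: *"
disguised = allowed("Googlebot")       # same crawler, different claimed name
```

Here the same crawler is refused under its own name and permitted as soon as it calls itself Googlebot, which is why robots.txt alone cannot be used as an access control.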
Blocking crap requires cunning: code that looks for suspicious access patterns and responds.
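As a rough idea of what "looking for suspicious access patterns" could mean, here is a sketch that counts recent 404 responses per client IP in a sliding time window. The class name, the thresholds, and the choice of 404s as the signal are all assumptions for illustration, not a recommendation of specific values.

```python
from collections import defaultdict, deque


class NotFoundTracker:
    """Flag clients that rack up many 404s within a short time window."""

    def __init__(self, max_misses=20, window_seconds=60.0):
        # Both thresholds are arbitrary placeholders; tune them per site.
        self.max_misses = max_misses
        self.window_seconds = window_seconds
        self._misses = defaultdict(deque)  # ip -> timestamps of recent 404s

    def record(self, ip, status, now):
        """Record one response; return True if this ip now looks suspicious."""
        hits = self._misses[ip]
        if status == 404:
            hits.append(now)
        # Forget 404s that have fallen out of the window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_misses
```

A caller would pass in the client IP, the response status, and a timestamp such as time.monotonic() for every request, and decide what "responds" means (throttle, challenge, ban) when record returns True. A site with many broken internal links would need a high threshold to avoid flagging its own visitors.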
A genuine Googlebot should be operating from a Google domain. If we do a reverse DNS lookup on the client IP of a Googlebot request, we should get a hostname in the ".googlebot.com" domain.
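The reverse lookup can be sketched in Python with the standard socket module. The function name is made up, and the forward-confirmation step (resolving the returned hostname back to an IP and checking that it matches) is an addition not mentioned above; it is there because whoever controls an IP range also controls its reverse DNS records. The lookup functions are parameters so the check can be exercised without network access.

```python
import socket


def is_genuine_googlebot(ip,
                         reverse_lookup=socket.gethostbyaddr,
                         forward_lookup=socket.gethostbyname_ex,
                         suffixes=(".googlebot.com",)):
    """Forward-confirmed reverse DNS check for a claimed Googlebot IP."""
    try:
        # Reverse lookup: IP -> hostname (first element of the result tuple).
        hostname = reverse_lookup(ip)[0].rstrip(".").lower()
    except OSError:
        return False  # no reverse record at all
    if not hostname.endswith(tuple(suffixes)):
        return False
    try:
        # Forward lookup: hostname -> list of IPv4 addresses (third element).
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    # Only trust the hostname if it resolves back to the same IP.
    return ip in addresses
```

Note that socket.gethostbyname_ex only handles IPv4, so an IPv6 client would need a different forward lookup, and in practice the result should be cached rather than resolved on every request.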
It really depends on the site. From what I've seen on large sites that went through 10 years or so of revisions and redesigns, the vast majority of 404s come from broken links within the site itself.
Heck, when MSFT redesigned TechNet/MSDN, half of the links were dead. A lot of the Google results for legacy products will still lead you to a 404 page on TechNet...