I see they say "if a domain or page is not crawlable by any search engine (it has a noindex tag), or if it is not crawlable by googlebot, then Brave Search’s bot will not crawl it either."
Does the Brave crawler send the Googlebot or the regular Chrome User-Agent string? If it sends something different from the standard Googlebot User-Agent string, you could dynamically serve a robots.txt that blocks Googlebot to every client except Googlebot. OTOH, I've read that the Google crawler sometimes uses the regular Chrome User-Agent string and penalizes sites that return different content to Googlebot and Chrome.
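To make the dynamic robots.txt idea concrete, here is a minimal sketch. It assumes the crawler can be told apart by its User-Agent header alone, which may not hold: the header is trivially spoofed, and nothing here is confirmed about what Brave's crawler actually sends. The function and constant names are made up for illustration.

```python
# Sketch only: choose a robots.txt body based on the User-Agent header.
# Assumption: real Googlebot requests contain "Googlebot" in the header.

ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_GOOGLEBOT = "User-agent: Googlebot\nDisallow: /\n"


def robots_txt_for(user_agent: str) -> str:
    """Return the robots.txt body to serve for this User-Agent header."""
    if "googlebot" in user_agent.lower():
        # Googlebot itself sees a file that permits crawling.
        return ALLOW_ALL
    # Every other client sees a file that disallows Googlebot.
    return BLOCK_GOOGLEBOT
```

A real deployment would probably also verify Googlebot by reverse DNS rather than trusting the header, and the cloaking concern mentioned above still applies.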
What if I want googlebot to crawl it but not bravebot? Every other search engine lets me block its crawler specifically. Only Brave has this shady policy.
> What if I want googlebot to crawl it but not bravebot?
Then you need to gate your content such that it is not available openly to the public.
This falls in line with many objections to Google's WEI. If you host content openly and allow access freely, then don't be surprised when people access it at will and use it for free.
Then why does bravebot obey robots.txt at all? It does, and it will respect blocks of Googlebot, but it won't allow blocking just it or just Googlebot.
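For comparison, this is how per-crawler blocking normally works under standard robots.txt matching, sketched with Python's `urllib.robotparser`. `ExampleBot` is a hypothetical crawler name and the URL is a placeholder; this shows the standard semantics only, not how Brave's crawler evaluates the file.

```python
from urllib.robotparser import RobotFileParser

# Block one named crawler and allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str = "https://example.com/page") -> bool:
    """Check whether user_agent may fetch url under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

With this file, `allowed("ExampleBot")` is false while `allowed("Googlebot")` is true, which is the kind of crawler-specific block being asked for.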
Or probably just an innocent oversight? I imagine they might have taken this decision early on when they were far too small for anybody to even think of not wanting to be crawled by them, and just never revisited the decision.
You want the monopolistic tech giant to crawl you but not a small privacy-focused company? What possible justification could you have for this attitude?
1: https://brave.com/search/api/