I wonder if it's worth poisoning the replies for scrapers that don't obey robots.txt. Send back nonsense, lies, and noise. This would be an adversarial approach, like the one https://adnauseam.io/ uses against ad tracking.
Years ago I came across an email crawler trap: if a bot was unfortunate enough to wander into it, it would generate (from the e-mail harvesting bot's point of view) an endless, nested tree of pages filled with randomly generated garbage emails. It was just a bit of PHP, but I wouldn't be surprised if you could do something similar here: serve something the LLM scraper thinks are comments but that is just randomly generated garbage.
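
For anyone curious what such a trap might look like, here is a rough sketch in Python rather than PHP. It is not the original script, which I don't have. The function names (`trap_page`, `_rng_for`, `_word`), the `/trap/` path, and the page layout are all made up for illustration. The idea is to seed a random generator from the requested path, so every URL always returns the same fake page and the tree looks endless without storing anything. Addresses use the reserved `.example` TLD so they can never belong to a real person.

```python
import hashlib
import random
import string


def _rng_for(path: str) -> random.Random:
    # Seed from the path so the same URL always yields the same page.
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def _word(rng: random.Random, lo: int = 4, hi: int = 10) -> str:
    # Random lowercase string with a length between lo and hi inclusive.
    length = rng.randint(lo, hi)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def trap_page(path: str, n_emails: int = 5, n_links: int = 3) -> str:
    """Build one HTML page of fake emails plus links leading deeper into the trap."""
    rng = _rng_for(path)
    # .example is a reserved TLD, so these addresses are guaranteed to be fake.
    emails = [f"{_word(rng)}@{_word(rng)}.example" for _ in range(n_emails)]
    base = path.rstrip("/")
    # Each child link points one level deeper; following it produces another page.
    links = [f"{base}/{_word(rng)}/" for _ in range(n_links)]
    lines = ["<html><body>"]
    lines += [f'<p><a href="mailto:{e}">{e}</a></p>' for e in emails]
    lines += [f'<p><a href="{link}">{link}</a></p>' for link in links]
    lines.append("</body></html>")
    return "\n".join(lines)
```

You would hook `trap_page(request_path)` up to whatever route you disallow in robots.txt, so only bots that ignore it ever end up there. Swapping the fake emails for generated "comments" would be the same trick aimed at LLM scrapers.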