I'd rather have a site that shows how well my site is protected from being accessed by AI agents and advises how I can lock it down further. Basically, the exact opposite of this.
Maybe we can start a new protocol where the HTML is encrypted, and the viewer must try 2^10 to 2^20 hashes before the decryption key is discovered. Same proof-of-work idea that BTC mining uses. It would be a negligible cost for any single user but terribly expensive for crawling en masse.
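To make the idea concrete, here is a minimal sketch of how such a puzzle could work, using only the Python standard library. Everything in it is an assumption for illustration: the names `seal_page` and `open_page`, the puzzle layout, and the XOR keystream, which is a toy stand-in for a real cipher and not something to deploy. It is also a brute-force key search rather than Bitcoin's "find a hash below a target" search, though the cost model is similar: the server hides the key behind a random nonce, and the client has to try up to 2^difficulty_bits candidates (about half that on average).

```python
import hashlib
import secrets


def _derive_key(salt: bytes, nonce: int) -> bytes:
    # The key is a hash of a public salt plus a secret nonce.
    return hashlib.sha256(salt + nonce.to_bytes(8, "big")).digest()


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    # Toy stream cipher (assumption, illustration only): XOR the data with
    # SHA-256 output computed per 32-byte block. Applying it twice decrypts.
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


def seal_page(html: str, difficulty_bits: int = 16) -> dict:
    # Server side: pick a secret nonce in [0, 2^difficulty_bits) and encrypt.
    salt = secrets.token_bytes(16)
    nonce = secrets.randbelow(2 ** difficulty_bits)
    key = _derive_key(salt, nonce)
    return {
        "salt": salt,
        "difficulty_bits": difficulty_bits,
        "key_check": hashlib.sha256(key).digest(),  # lets the client recognize the key
        "ciphertext": _keystream_xor(key, html.encode("utf-8")),
    }


def open_page(puzzle: dict) -> tuple[str, int]:
    # Client side: try every nonce until the derived key matches key_check.
    for nonce in range(2 ** puzzle["difficulty_bits"]):
        key = _derive_key(puzzle["salt"], nonce)
        if hashlib.sha256(key).digest() == puzzle["key_check"]:
            html = _keystream_xor(key, puzzle["ciphertext"]).decode("utf-8")
            return html, nonce + 1  # number of candidates tried
    raise ValueError("no key found within the search space")
```

Calling `open_page(seal_page("<h1>hi</h1>", difficulty_bits=16))` returns the original markup plus the number of candidates tried. Raising `difficulty_bits` by one doubles the expected work for every visitor, human or crawler alike.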
Anything that increases the entry time by a second or more is a pretty good way to make me (and probably others) just not bother with opening the website.
Usually the Anubis anti-bot things only take a second. But I stared at one for more than 30 seconds the other day when I tried to access one of the Linux kernel websites. Literally just a progress bar with a hash counter. I was on a modern iPhone, I don’t know why it took so long. Maybe because my phone had low battery? But it’s infuriating that this is what the web has become.
The web is becoming more and more unusable every day. If your data is easy to access, it gets stolen and scraped, and your site is effectively DDoSed. If your site is hard to access, nobody will visit.
This is just introducing a small business cost for AI/scrapers and a reason to bail out of the funnel for real users--so by charging, you'll have an even larger percentage of bots.
Folks added an optional field to store a broad age category to optionally present to websites to facilitate keeping children out of porn sites, and everyone lost their collective minds.
One could consider that the LLM paradox: If you don't want an LLM talking about how to make a nuclear weapon, you first need to explain to it how to make a nuclear weapon, which increases the likelihood, despite your admonition, that it would talk about it.
So perhaps you can point your LLM at this and ask it to invert the rules and make sure user design remains consistent.
Last night I had a nightmare about Cloudflare finally monetizing the "making sure you're not a robot" page. AI agents got the information they needed, we got ads instead ("why are you here? You're supposed to let agents do the thing. Watch some ads instead").
I dream of the day where we have the opposite. Each website you visit/scrape/your bot interacts with asks you for $0.01 as payment in lightning tokens. You pay per visit and you don't have to see ads or be tracked anymore.
Bot could look at remaining balance and decide which sites to visit. Ah, <popular resource> has raised rates to 0.025 microtokens/access, I'll have to use <secondary resource> which is still a budget-friendly 0.005 mt.
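A rough sketch of that decision, assuming a made-up `Source` record and a per-access price ceiling (`max_price_mt`) that the bot's owner sets; none of these names come from a real payment API. Sources are listed in order of preference, and the bot takes the first one it can afford.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Source:
    name: str
    price_mt: float  # hypothetical price per access, in microtokens


def pick_source(candidates: list[Source], balance_mt: float,
                max_price_mt: float) -> Optional[Source]:
    # Walk the sources in preference order and return the first one whose
    # price fits both the per-access ceiling and the remaining balance.
    for source in candidates:
        if source.price_mt <= max_price_mt and source.price_mt <= balance_mt:
            return source
    return None  # nothing affordable: skip the fetch


sources = [Source("popular", 0.025), Source("secondary", 0.005)]
choice = pick_source(sources, balance_mt=1.0, max_price_mt=0.01)
```

With the ceiling at 0.01 mt, the popular resource at 0.025 mt is skipped and `choice` is the secondary one.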
If you depend entirely on search engines, for sure. I do not have a commercial site, but if I did I would pay other popular and related sites to link to me in a classy, non-spammy way. I would also pay influencers to link to me and talk about my site.
I could totally imagine Joe Rogan saying, "Hey Jamie, what was that site? Oh yeah go to ai dash sucks dash bfdd dot newsdump dot org to get your copy of an SSH banner today."
I've had traffic sent to me long ago from paying into Google's program but it was mostly bots. This was in the 2003-2009 time-frame. I imagine by now it's not much better.