Hacker News
by microtonal 51 days ago
I also block all AI crawlers. I am not sure why I should give them my content for them to rip it off and make money from it through training or agents. Sadly, a lot of AI companies are trying to make their requests indistinguishable from regular browser traffic on residential connections, so unfortunately I have to use Cloudflare to block them.

Ideally I'd make the content available to crawlers for training open models, but that seems to be nearly impossible. It would be possible if other AI companies behaved.

>so unfortunately I have to use Cloudflare to block them.

That can’t block Grok, can it?

(You might have a fake iPhone or something visit your site if you ask Grok to retrieve information from it)

What's the IP address of the supposed iPhone? Does it come from T-Mobile or from xAI?

Residential, I thought? It might even have been someone on here who posted about watching their server logs while they messaged Grok themselves.
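
For anyone who wants to repeat that log-watching experiment, here is a rough sketch of pulling out the client IPs behind requests that claim to be an iPhone. It assumes the server writes the common "combined" access log format; the function name, the sample lines, and the addresses (taken from documentation ranges) are all made up for illustration.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def ips_for_user_agent(lines, needle="iPhone"):
    """Count requests per client IP whose User-Agent contains `needle`."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and needle in m.group("ua"):
            counts[m.group("ip")] += 1
    return counts

# Hypothetical sample lines, not real traffic.
sample = [
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /post HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15"',
    '198.51.100.4 - - [10/Oct/2024:13:55:40 +0000] "GET / HTTP/1.1" 200 1024 '
    '"-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '203.0.113.7 - - [10/Oct/2024:13:56:02 +0000] "GET /about HTTP/1.1" 200 2048 '
    '"-" "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15"',
]

result = ips_for_user_agent(sample)
for ip, hits in result.most_common():
    print(ip, hits)
```

The addresses that come out can then be looked up with whois to see which network they are registered to. That only tells you who owns the address block, not whether the traffic went through a proxy.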

Curious if xAI has a phone farm. Maybe just running simulators on servers?

Residential proxies are a commodity at this point. You can also run your own network and try to get it misclassified as residential.