Hacker News
by faangguyindia 53 days ago
Yesterday I logged into Cloudflare and found that it had blocked ChatGPT and Claude from accessing my site. https://macrocodex.app

This is bad because there are fitness guides on my domain

https://macrocodex.app/guides which newbies often paste into ChatGPT and ask it to simplify.

I enabled crawling for LLMs. There is a lot of misinformation in the fitness field, so it's better if LLMs get their content from people who at least have experience in the field.

1 comment

In the ChatGPT context, it is good to make a proper distinction between crawlers and agents. The crawlers collect content to build a new model; the agents fetch content on behalf of users. The latter can be very useful.
They use different user-agent strings. The crawlers obfuscate themselves and use residential proxies. The agents call themselves ChatGPT-User. Of course Cloudflare wants OpenAI to pay them for not blocking ChatGPT-User by default.
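To make the user-agent distinction concrete, here is a rough sketch of how a site could recognize agent traffic by its User-Agent header. Only the ChatGPT-User token comes from this thread; the helper name `is_user_agent_fetch`, the `AGENT_TOKENS` list, and the sample header strings are assumptions for illustration. The header is also trivially spoofable, and a crawler that hides its identity will simply not match, so treat this as a hint rather than proof.

```python
# Hypothetical helper: flag requests whose User-Agent header names a
# user-facing agent. The token list is an assumption; only "ChatGPT-User"
# is taken from the discussion above.
AGENT_TOKENS = ("ChatGPT-User",)


def is_user_agent_fetch(user_agent: str) -> bool:
    """Return True if the header contains a known agent token (case-insensitive)."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in AGENT_TOKENS)


if __name__ == "__main__":
    # Illustrative header values, not captured traffic.
    samples = [
        "Mozilla/5.0 (compatible; ChatGPT-User/1.0; +https://openai.com/bot)",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36",
    ]
    for ua in samples:
        label = "agent" if is_user_agent_fetch(ua) else "unknown"
        print(f"{label}: {ua}")
```

A request that looks like an ordinary browser comes back as "unknown" here, which is exactly the case where header matching cannot help.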
It's true: crawlers used for AI training don't identify themselves as crawlers at all.