|
|
|
|
|
by chlorion
149 days ago
|
|
I self host a small static website and a cgit instance on an e2-micro VPS from Google Cloud, and I have got around 8.5 million requests combined from openai and claude over around 160 days. They just infinitely crawl the cgit pages forever unless I block them! (1) root@gentoo-server ~ # egrep 'openai|claude' -c /var/log/lighttpd/access.log
8537094
So I have lighttpd setup to match "claude|openai" in the user agent string and return a 403 if it matches, and a nftables firewall seutp to rate limit spammers, and this seems to help a lot. |
|
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36", anyone?