by simondotau
212 days ago
My base advice is to make sure you have a very efficient code path for login pages. 10 pages per second is nothing if you don’t have to perform any database queries (because you don’t have any authentication token to validate). Beyond that, look at how the bots are finding new URLs to probe, and don’t give them access to those lists/indexes. In particular, don’t forget about site maps. I use Cloudflare rules to restrict my site map to known bots only.
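
A rough sketch of that fast path, assuming a pre-rendered login page and a cookie-based session token. The names `handle_request`, `lookup_session`, and `LOGIN_PAGE` are hypothetical and not tied to any particular framework:

```python
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical: the login page is rendered once at startup and reused.
LOGIN_PAGE = b"<html><body><form method='post'>login form</form></body></html>"


def handle_request(
    cookies: Mapping[str, str],
    lookup_session: Callable[[str], Optional[dict]],
) -> Tuple[int, bytes]:
    """Serve the login page without touching the database when no token is sent."""
    token = cookies.get("session")
    if not token:
        # No token means nothing to validate, so no database query is needed.
        return 200, LOGIN_PAGE

    # Only requests that actually carry a token pay for a database lookup.
    session = lookup_session(token)
    if session is None:
        return 200, LOGIN_PAGE
    return 200, b"welcome back"
```

Anonymous bot traffic then only ever hits the first branch, which is why the request rate matters so little there.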
|
They discovered those URLs simply by parsing pages that contain like buttons. Those do have rel="nofollow" on them, and the URL pattern is disallowed in robots.txt, but I'd be surprised if that stopped someone who uses thousands of IPs to proxy their requests. I don't have a site map.