by viraptor 1346 days ago
I'd go against the "just increase the cf strictness" advice. It counts on cf doing something magic while hoping not to affect real users, and that's not really possible.

1. Why do you want to stop bots? Are they actually overloading your resources, or are they just noisy in the logs? If you can easily handle the traffic, maybe find a way to filter the logs better.

2. How do you know they're bots? If they're easy to identify, can you write a few simple rules to remove most of them?

2a. Are they mindless scans? Make sure your app doesn't even see requests to resources which don't exist.

2b. Are they scraping content? Set up per-resource-per-IP rate limits (token bucket style).

2c. Are they coming from a specific network, for example Tor, AWS, or similar? Put in an auto-updating list of sources that get dropped at the firewall level.

3. As mentioned in other comments, if you're using some proxy in front of your service, ensure you drop any traffic which bypasses it.
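
To make the "token bucket style" limit from 2b concrete, here is a minimal in-memory sketch. The class names, the default rate and capacity, and the `(ip, resource)` key are all made up for illustration; a real setup would more likely do this in the proxy or a shared store rather than in app memory.

```python
import time


class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start full so a new client gets its burst
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the time that passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerResourcePerIPLimiter:
    """Hypothetical helper: one bucket per (client IP, resource) pair."""

    def __init__(self, rate=1.0, capacity=5):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}  # grows without bound; a real version needs eviction

    def allow(self, ip, resource, now=None):
        key = (ip, resource)
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = self.buckets[key] = TokenBucket(self.rate, self.capacity, now)
        return bucket.allow(now)
```

The `now` argument only exists so the behaviour can be checked without sleeping; in normal use you'd call `limiter.allow(ip, path)` and answer with a 429 when it returns `False`.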

Basically consider what's actually happening and respond to that. There's no setting that will improve things without side effects, or it would be already turned on.
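
For 2c and 3, the underlying check is the same: is this address inside a known set of networks? A rough sketch using Python's standard `ipaddress` module follows. The function names and the CIDR ranges are placeholders (the ranges are reserved documentation blocks, not real Tor/AWS/proxy ranges), and in practice you'd feed the same list to the firewall instead of checking it in application code.

```python
import ipaddress


def parse_networks(lines):
    """Parse CIDR strings, skipping blank lines and '#' comments."""
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def in_any(ip, networks):
    """True if `ip` falls inside at least one of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


# Placeholder data: a blocklist you would refresh on a schedule (2c)
# and the ranges your fronting proxy connects from (3).
blocked = parse_networks([
    "# refreshed periodically from whatever source you trust",
    "192.0.2.0/24",
    "198.51.100.0/24  # placeholder range",
])
proxy_ranges = parse_networks(["203.0.113.0/24"])


def should_drop(peer_ip):
    """Drop known-bad sources and anything that did not come through the proxy."""
    return in_any(peer_ip, blocked) or not in_any(peer_ip, proxy_ranges)
```

Note that when a proxy sits in front, the peer address is the proxy's, so the blocklist part would need to run on the proxy itself or on a client address you actually trust.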

1 comments

1. Bots are more than half of my traffic right now, and they provide no benefit: they only use my bandwidth and distort my statistics.

2. Strange behavior + technology.

2a-c. Could be a part of a solution.

3. Good idea.

The purpose of this thread is to gather ideas and experiences. A silver bullet would be great, but since it's not realistic, all ideas are more than welcome.