| I run a fairly large website, and the bot traffic follows a few easy-to-spot patterns. The usage is extremely quick. We noticed a spike in bounce rate. The bots never come from Google, and the badly programmed ones just crawl several pages at a time, faster than a user could. Then there are the crazy spikes in visits from specific countries, pretty much scraping the entire content, often from pools of IPs. In some cases we had unexplained (meaning: it wasn't viral or a marketing campaign), random, sustained 30% increases in traffic. They also don't interact with the complicated widgets, so there are zero XHR requests other than analytics pings. They don't cause spikes in Google Analytics either, so I assume it's blocked, but they show up in the logs and in our internal analytics. It's nowhere near enough to DDoS the website, but it's a lot of noise in the statistics that we have to learn to filter. |
I’ve triggered this kind of “bot protection” right here on Hacker News many times: by having a bunch of Hacker News pages open and then closing and reopening my browser, by opening a bunch of links in the background too quickly, and by reading an article, then clicking back and upvoting/favouriting too quickly. I’m also located in Singapore, which people here have recently started to advocate blocking.
A single legitimate user can easily trigger these kinds of heuristics just by using the site in a way you don’t expect. This can affect some users far more than others, e.g. disabled people who need to use assistive technology.
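
To make that concrete, here's a rough Python sketch of two of the heuristics from the parent comment (several page loads faster than a person would click, and no XHR traffic). The function name, the thresholds, and the log tuple format are all made up for illustration, and analytics pings are assumed to be filtered out before this runs.

```python
from collections import defaultdict

def flag_suspected_bots(requests, max_pages=5, window_seconds=10.0):
    """Return the client IDs that match both heuristics.

    requests: iterable of (client_id, timestamp_seconds, is_xhr) tuples.
    This format is hypothetical; analytics pings are assumed to be excluded.
    """
    page_times = defaultdict(list)
    made_xhr = set()
    for client_id, ts, is_xhr in requests:
        if is_xhr:
            made_xhr.add(client_id)
        else:
            page_times[client_id].append(ts)

    flagged = set()
    for client_id, times in page_times.items():
        if client_id in made_xhr:
            continue  # interacted with a widget, so treat as human
        times.sort()
        # Flag if any max_pages consecutive page loads fit inside the window.
        for i in range(len(times) - max_pages + 1):
            if times[i + max_pages - 1] - times[i] <= window_seconds:
                flagged.add(client_id)
                break
    return flagged

# A scraper hitting a page every half second.
scraper = [("scraper", t * 0.5, False) for t in range(20)]
# A person opening six links in background tabs, one per second.
human_tabs = [("human_tabs", 100 + t, False) for t in range(6)]
# A person who reads slowly and uses a widget once.
reader = [("reader", 0, False), ("reader", 5, True), ("reader", 90, False)]

flagged = flag_suspected_bots(scraper + human_tabs + reader)
print(sorted(flagged))
```

Both `scraper` and `human_tabs` get flagged here; only `reader` is spared, because of the single XHR request. The tab-opening human is indistinguishable from the scraper under these two rules, which is exactly the false positive I keep hitting.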