by lelandbatey 2306 days ago
In general it's quite easy to filter out bots and crawlers from your basic access logs, as most bots and crawlers will identify themselves as such.
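
As a rough sketch of that kind of filtering (assuming the common "combined" access log format, where the user agent is the last double-quoted field, and a short marker list that I made up and that is far from exhaustive), something like this would drop the bots that announce themselves. It does nothing about bots that send a browser-like user agent.

```python
import re

# Hypothetical, non-exhaustive substrings that self-identifying bots tend to
# include in their User-Agent header. Adjust for your own traffic.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")

# Assumes the "combined" log format: the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def is_self_identified_bot(log_line):
    """Return True if the line's user agent contains a known bot marker."""
    match = UA_PATTERN.search(log_line)
    if not match:
        return False  # No user agent found; leave the line alone.
    user_agent = match.group(1).lower()
    return any(marker in user_agent for marker in BOT_MARKERS)


def drop_self_identified_bots(log_lines):
    """Keep only lines whose user agent does not announce itself as a bot."""
    return [line for line in log_lines if not is_self_identified_bot(line)]
```

You'd feed it the lines of your access log and then count whatever is left for the paths you care about.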

If you're running anything with an API, then unless something's horribly wrong it's even easier: look at the number of requests being made to an API endpoint and spot-check a few of the user identifiers (tokens, keys, whatever you're using) to see the variety of users.
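
A minimal sketch of that check, assuming you've already parsed your logs into (endpoint, user identifier) pairs. The record shape and the function name are mine, not from any particular framework.

```python
from collections import defaultdict


def summarize_usage(records):
    """Count requests and distinct user identifiers per endpoint.

    `records` is assumed to be an iterable of (endpoint, user_id) pairs
    that you have extracted from your own logs.
    """
    request_counts = defaultdict(int)
    users_seen = defaultdict(set)
    for endpoint, user_id in records:
        request_counts[endpoint] += 1
        users_seen[endpoint].add(user_id)
    return {
        endpoint: {
            "requests": request_counts[endpoint],
            "distinct_users": len(users_seen[endpoint]),
        }
        for endpoint in request_counts
    }
```

A high request count with only one or two distinct identifiers usually means a single integration hammering the endpoint rather than broad use of the feature.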

All of this is assuming you're trying to merely investigate the volume of use of a feature, not trying to diagnose demographics. If you're trying to extract more fine-grained detail, I don't have as many answers; I hope others will chime in with constructive ways to get things like geographic demographics via server logs.

2 comments

A very sizable portion of bot traffic does not identify itself as such. I don’t know if it’s a majority now, but it could be.
Many bots and crawlers are designed to be indistinguishable from humans.