Ask HN: Is anyone tracking AI traffic to their site? Should we care?
2 points by ATechGuy 111 days ago
Lately we've been noticing a non-trivial amount of traffic in our logs that doesn't look like typical bots.

Not the usual noisy crawlers or obvious scrapers. The behavior is different: fewer hits and more selective page access.

Some of the user agents suggest AI crawlers, but some do not. How can we track these visitors?

2 comments

Yeah, those selective hits scream custom scrapers or AI data hunters. To track 'em:

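- Parse logs: zcat -f access.log* | awk '{print $1,$7}' | sort | uniq -c | sort -nr | head -20

Shows the top IP/path pairs (assuming the combined log format, where $1 is the client IP and $7 is the request path). The -f lets zcat read rotated logs that aren't gzipped yet. Run whois on the suspicious ones.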

- Add JS fingerprinting (canvas hashing, WebGL) so you can log which visitors are real browsers and which are headless.

- Set up bait pages with unique content, so you can see who fetches them.
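
If you'd rather do the log-parsing step in a script, here's a rough Python sketch. It assumes combined log format, and the AI_UA_TOKENS list is my own short, non-exhaustive pick of crawler names that identify themselves in the user agent; it will go stale and won't catch anything that hides its identity. The sample lines are made up for illustration.

```python
import re
from collections import Counter

# Combined log format (assumed):
# ip ident user [time] "METHOD path proto" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Assumption: a small, non-exhaustive set of self-declared AI crawler tokens.
AI_UA_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "PerplexityBot", "Bytespider")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None


def summarize(lines):
    """Count hits per (ip, path) and hits per self-declared AI crawler token."""
    pairs = Counter()
    declared_ai = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # skip lines in some other format
        pairs[(rec["ip"], rec["path"])] += 1
        ua = rec["ua"].lower()
        for token in AI_UA_TOKENS:
            if token.lower() in ua:
                declared_ai[token] += 1
                break
    return pairs, declared_ai


# Made-up sample lines using documentation-reserved IP ranges.
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /docs/api HTTP/1.1" '
    '200 2326 "-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '203.0.113.5 - - [10/Oct/2024:13:55:40 +0000] "GET /docs/api HTTP/1.1" '
    '200 2326 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/130.0"',
    '198.51.100.7 - - [10/Oct/2024:13:56:01 +0000] "GET /pricing HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/130.0"',
    'not a log line',
]

if __name__ == "__main__":
    pairs, declared_ai = summarize(SAMPLE)
    for (ip, path), hits in pairs.most_common(20):
        print(hits, ip, path)
    print(dict(declared_ai))
```

On real logs you'd feed it an open file handle (or gzip.open in text mode for rotated files) instead of SAMPLE.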

Set up alerts on anomalies. Caught some sneaky ones that way!
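
For the bait-page and alerting parts, something like this is a starting point. It's only a sketch: the bait path, the factor and min_hits thresholds, and the idea of comparing against a per-IP baseline are all my assumptions, not a standard method. It takes (ip, path) pairs you've already pulled out of the logs.

```python
from collections import Counter

# Hypothetical bait path; use something unique that real users never reach.
BAIT_PATHS = {"/bait-7f3a"}


def bait_visitors(requests, bait_paths=BAIT_PATHS):
    """Return the set of IPs that requested any bait path."""
    return {ip for ip, path in requests if path in bait_paths}


def volume_anomalies(requests, baseline, factor=3.0, min_hits=10):
    """Return {ip: hits} for IPs with at least min_hits requests whose count
    exceeds factor times their baseline count (unknown IPs have baseline 0)."""
    today = Counter(ip for ip, _ in requests)
    flagged = {}
    for ip, hits in today.items():
        if hits >= min_hits and hits > factor * baseline.get(ip, 0):
            flagged[ip] = hits
    return flagged


if __name__ == "__main__":
    # Made-up data: one bait hit, one new busy IP, one known IP at normal volume.
    requests = (
        [("198.51.100.7", "/bait-7f3a")]
        + [("203.0.113.5", "/docs")] * 12
        + [("192.0.2.1", "/")] * 12
    )
    baseline = {"192.0.2.1": 10}
    print("bait:", bait_visitors(requests))
    print("volume:", volume_anomalies(requests, baseline))
```

Volume thresholds alone will miss the low-and-slow visitors the OP describes, which is why the bait check is separate: a single hit on a bait path is enough to flag an IP.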

I'd start with the "why care?" question.