by immibis 230 days ago
It's interesting to study, right? This is the Internet equivalent of background radiation. Harmless in most cases. Exploit scanners aren't new to the LLM age and shouldn't overload your server - unless you're vulnerable to the exploit.

Fun fact: Some people learn about new exploits by watching their incoming requests.


> It's interesting to study, right?

Definitely! I wasn't experiencing any issues; hell, the site wasn't even meant for public consumption at that time, so no great loss to me. But I found a few things fascinating (and somewhat stupid!) about it:

1. The sheer number of automated requests to scrape my content

2. That a massive number of the bots openly had "bot" or some derivative in the user agent and they were accessing a page I'd explicitly denied! :D

3. That an equally large number were faking their user agents to look like regular users and still hitting a page that a regular user couldn't possibly ever hit!
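
For anyone wanting to split hits the same way (points 2 and 3), here is a minimal sketch, not what I actually ran. The trap path `/trap-page`, the pattern list, and the `classify_hit` helper are all made up for illustration; the only assumption is that no human-reachable link points at the disallowed page.

```python
import re

# Hypothetical trap path: a page disallowed in robots.txt and linked
# from nowhere a regular visitor could reach.
TRAP_PATH = "/trap-page"

# Substrings that self-identifying crawlers commonly put in their
# user agent (an illustrative list, not an exhaustive one).
BOT_PATTERN = re.compile(r"bot|crawl|spider|scrape", re.IGNORECASE)


def classify_hit(path: str, user_agent: str) -> str:
    """Label a request as 'ignored', 'declared-bot', or 'disguised-bot'."""
    if path != TRAP_PATH:
        return "ignored"
    if BOT_PATTERN.search(user_agent):
        # Announces itself as a bot, yet fetched a disallowed page.
        return "declared-bot"
    # A browser-like user agent on a page no regular user can reach
    # is most likely spoofed.
    return "disguised-bot"
```

Feeding each access-log line through `classify_hit` and counting the labels would give rough numbers for both groups.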

Something I noticed only towards the end and didn't pursue (I should log it better next time for analysis!): the endpoint was dynamically generated and only existed in the robots.txt for a short time. Yet I caught bots later on, long after that auto-generated page was created (and after the IP was banned), still going for that same page: clearly the same entities!
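
A rough sketch of how that could be tracked properly next time, assuming the trap path is a random token that gets rotated now and then. `TrapRegistry` and its methods are hypothetical names, not the setup I had:

```python
import secrets
from typing import Dict, Optional


class TrapRegistry:
    """Tracks the current robots.txt trap path and every retired one."""

    def __init__(self) -> None:
        self.current: Optional[str] = None
        self.retired: Dict[str, float] = {}  # path -> time it was retired

    def rotate(self, now: float) -> str:
        """Retire the current trap path (if any) and generate a new one."""
        if self.current is not None:
            self.retired[self.current] = now
        self.current = "/" + secrets.token_hex(8)
        return self.current

    def robots_txt(self) -> str:
        """Render a robots.txt that disallows only the current trap path.

        Assumes rotate() has been called at least once.
        """
        return f"User-agent: *\nDisallow: {self.current}\n"

    def stale_age(self, path: str, now: float) -> Optional[float]:
        """Seconds since `path` left robots.txt, or None if it is not a retired trap."""
        retired_at = self.retired.get(path)
        if retired_at is None:
            return None
        return now - retired_at
```

Any request where `stale_age` returns a number is going for a path that was only discoverable during a short window, so a hit from a fresh IP suggests a client replaying something harvested earlier.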

My spidey senses are tingling. Next time, I'm going to log the shit out of these requests and publish as much as I can for others to analyse and dissect... might be interesting.