by gnfargbl 2001 days ago
With any large-scale scan dataset like this, noise is inevitable. Legitimate use of HTTP/0.9 in consumer-facing web servers is exceptionally unlikely, but there are all sorts of scenarios that could have caused HTTP/0.9 responses to bleed into the data.

For instance, here is an untested hypothesis: ~30 of the hostnames on the list have abandoned DNS A records that still point to EC2 IPs. Those EC2 IPs have since been repurposed as honeypots of some kind. The honeypots present themselves as HTTP/0.9 in order to look more like low-grade IoT devices.

That hypothesis is almost certainly wrong, but you could quickly invent another and at some point one of them will be correct. The internet is just a very messy place.
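
To make the "noise" point concrete, here is a minimal sketch of how a scanner might label a response as HTTP/0.9. It is an assumption on my part, not how this dataset was actually built: the function name `looks_like_http09` and the heuristic itself (no `HTTP/` status line at the start of the reply) are hypothetical.

```python
def looks_like_http09(raw: bytes) -> bool:
    """Hypothetical heuristic: HTTP/1.x replies begin with a status line
    such as b"HTTP/1.1 200 OK", while an HTTP/0.9 reply is just the body."""
    if not raw:
        # No bytes at all tells us nothing about the protocol.
        return False
    # Anything that does not open with a status line gets the 0.9 label.
    return not raw.lstrip(b"\r\n").startswith(b"HTTP/")


if __name__ == "__main__":
    samples = [
        b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi",  # ordinary server
        b"<html><body>hi</body></html>",                    # bare body
        b"SSH-2.0-OpenSSH_8.2\r\n",                         # not HTTP at all
    ]
    for sample in samples:
        print(looks_like_http09(sample), sample[:20])
```

Under a heuristic like this, anything that answers on the port without a status line, including the SSH banner above, would be counted as HTTP/0.9, which is one way stray hosts could end up in a list like this one.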