by JonoW 5332 days ago
As @maw said, most top sites remove the HTTP headers that identify the web server. Also, if you took the top, I don't know, 10k sites by traffic, I would imagine all of them use a load balancer, so how can Netcraft know what the web server behind it is?

Top sites? I think a few of these would probably qualify:

CNN.com — Server: nginx

Netflix.com — Server: Apache-Coyote/1.1

YouTube.com — Server: Apache

Wikipedia.com — Server: Apache

Twitter.com — Server: tfe

LinkedIn.com — Server: Apache-Coyote/1.1

Granted, this does not give you a good picture of their network topology, but in general they do report something that Netcraft can use.

Right. The version is often stripped out, but the server software tends to stay in there.
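To make that concrete, here is a minimal sketch of pulling the software name out of a `Server` header while treating the version as optional. The helper names `parse_server_header` and `server_from_raw_headers` are made up for illustration, and this is not a description of how Netcraft actually does it; it only assumes the common `product/version` layout of the header value.

```python
import re

# Hypothetical helper pattern: a product token, optionally followed by "/version".
_PRODUCT_RE = re.compile(r"^\s*([^\s/()]+)(?:/(\S+))?")


def parse_server_header(value):
    """Return (product, version); version is None when the site strips it."""
    match = _PRODUCT_RE.match(value or "")
    if not match:
        return (None, None)
    return (match.group(1), match.group(2))


def server_from_raw_headers(raw):
    """Find the Server header in a raw HTTP header block (name is case-insensitive)."""
    for line in raw.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":
            return parse_server_header(value)
    # No Server header at all, e.g. the site removed it entirely.
    return (None, None)


if __name__ == "__main__":
    # Header values taken from the list above.
    for header in ["nginx", "Apache-Coyote/1.1", "Apache", "tfe"]:
        print(header, "->", parse_server_header(header))
```

Fed the values listed above, `Apache-Coyote/1.1` splits into a product and a version, while `nginx`, `Apache`, and `tfe` come back with no version, which is the "version stripped, software still there" case.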

It's entirely possible there are extra heuristics in there, though. I can't remember the details from when I ran that thing, but a run took about a week and spent most of its time waiting on other people's servers, so burning a bunch more CPU on extra analysis wouldn't cause Netcraft any problems, I don't think.