As @maw said, most top sites remove the HTTP headers that identify the web server. Also, if you took the top, I don't know, 10k sites by traffic, I would imagine all of them use a load balancer, so how can Netcraft know what web server sits behind them?
Right. The version is often stripped out, but the server software tends to stay in there.
It's entirely possible there are extra heuristics in there, though. I can't remember the details from when I ran that thing, but it took about a week to run and spent most of its time waiting on other people's servers, so burning a bunch more CPU on extra analysis wouldn't cause Netcraft any issue at all, I don't think.
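For anyone curious what the basic header check looks like, here's a rough sketch in Python. To be clear, this is not Netcraft's method, just the naive version of "read the Server header and split off the version if there is one". The names `parse_server_header` and `fetch_server_header` are made up for illustration, and whatever sits in front of the site (load balancer, CDN) may be what answers, so the header can describe the proxy rather than the origin server.

```python
import re
import urllib.request

# Matches the first product token in a Server header, e.g. "Apache/2.4.41 (Ubuntu)".
# Group 1 is the software name, group 2 the version (absent if it was stripped).
_PRODUCT_RE = re.compile(r"^\s*([^\s/()]+)(?:/(\S+))?")


def parse_server_header(value):
    """Split a Server header value into (software, version).

    Hypothetical helper: returns (None, None) when the header is missing
    or empty, and (software, None) when only the version was stripped.
    """
    match = _PRODUCT_RE.match(value or "")
    if not match:
        return None, None
    return match.group(1), match.group(2)


def fetch_server_header(url, timeout=10):
    """Send a HEAD request and return the raw Server header, or None."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Server")
```

Something like `parse_server_header(fetch_server_header("https://example.com/"))` would give you the pair for one site. Most of the wall-clock time in a survey like this goes to waiting on the network, not to the parsing.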