by latch 5863 days ago
I extracted the data, added IP, country and response headers, and dumped it into a usable format:

http://openmymind.net/top1000data.txt

You can do some decently interesting analysis, like noticing that nginx is the front-end for nearly as many sites as IIS.

4 comments

Thanks for the data.

"nginx is the front-end for nearly as many sites as IIS": oh, I wish that were true, but according to my naive counting IIS is over 3 times more popular. Am I missing something?

$ wget http://openmymind.net/top1000data.txt

$ grep -c '"Server": "nginx"' top1000data.txt

39

$ grep -c '"Server": "Microsoft-IIS' top1000data.txt

149

Could you share how you extracted the data? I thought the way you have it in that txt file was nifty. Thanks.
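One thing worth checking in the counts above: the nginx pattern ends with a closing quote, so it only matches a header whose value is exactly `nginx`, while the IIS pattern has no closing quote and matches any version suffix. If the dump contains versioned values such as `nginx/<version>` (an assumption, I haven't verified the file), those would be missed. Here is a rough Python sketch that tallies by product name instead. The `"Server": "<value>"` shape is taken from the grep patterns; the function name `tally_servers` is just illustrative.

```python
import re
from collections import Counter

# Assumption: each Server header appears in the dump as "Server": "<value>",
# which is the shape the grep patterns above rely on.
SERVER_RE = re.compile(r'"Server":\s*"([^"]*)"')


def tally_servers(text):
    """Count Server header values by product name, ignoring version suffixes."""
    counts = Counter()
    for value in SERVER_RE.findall(text):
        # Keep only the part before the first "/", lowercased, so that a
        # versioned value and a bare value land in the same bucket.
        product = value.split("/", 1)[0].strip().lower()
        counts[product or "(empty)"] += 1
    return counts
```

With the file downloaded, something like `tally_servers(open("top1000data.txt").read()).most_common(10)` should list the ten most common products, which makes the nginx vs. IIS comparison independent of how each grep pattern is anchored.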
Why is dropbox 'myth and folklore'???
The categories in the JSON dump come directly from the original Google source, so I don't know (and Dropbox certainly isn't the only miscategorized entry):

http://www.google.com/adplanner/static/top1000/

Oh god, and I've been trusting them w/ my backups!!!!!
Over what time period are pageviews calculated?
The page views are pulled from the original Google list. They have more information at: http://www.google.com/support/adplanner/bin/answer.py?hl=en&...

They don't say what time frame page views cover, but they do say unique visits are counted over the course of 1 month, so it's probably safe to assume page views are also over 1 month.