Hacker News
by sndman 1589 days ago
I lobbied heavily to crawl my company's intranet since our official IT folks would not include the ~10k "grey" (as in unofficial) web servers set up by technical people over the years.

So I tried this myself and quickly realized how hard it was. I stumbled upon thousands of devices that had TCP port 80 (and 443) open, so I had to devise various ways of filtering these devices out of the crawl.
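
A minimal sketch of one way such filtering could look, not the method actually used here. It assumes each host has already been probed and that its HTTP `Server` header and page title were recorded; the hint strings and the names `looks_like_device` and `filter_crawl_targets` are hypothetical and would need tuning for a real network.

```python
# Illustrative substrings that often show up in the Server header or page
# title of embedded devices. This list is an assumption, not a complete set.
DEVICE_HINTS = ("jetdirect", "goahead", "rompager", "printer", "camera")


def looks_like_device(server_header: str, title: str) -> bool:
    """Return True if the probe results suggest an embedded device."""
    text = f"{server_header} {title}".lower()
    return any(hint in text for hint in DEVICE_HINTS)


def filter_crawl_targets(hosts):
    """Keep only the addresses that do not look like embedded devices.

    `hosts` is an iterable of (address, server_header, title) tuples
    collected by an earlier probe step (not shown).
    """
    return [
        address
        for address, server_header, title in hosts
        if not looks_like_device(server_header, title)
    ]
```

A substring check like this is crude and will both miss devices and occasionally drop real servers, so in practice it would probably be one filter among several.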

By the end of my project, I had run out of disk space so many times it was laughable. Tuning the crawler and managing the resulting mountain of data were daunting and started to affect my day job, so I eventually gave up.

A couple of months after my "project", and with enough warnings from IT and our network security folks, our company decided to purchase a couple of Google 1U servers.