by mseebach 5927 days ago
Less than six months ago, some internal (non-confidential, non-critical, but nonetheless internal) documents of a client of mine showed up on Google. The reason? They were public files in a folder on the webserver, and someone turned on Indexes in Apache, so the server handed out a browsable listing of the whole folder. It is the exact same problem.

Not even the shadow of a cloud (pun intended) was involved.
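
For anyone who wants to check their own server for the directory-listing problem described above, here is a minimal sketch. It assumes the listing is Apache's auto-generated page, which is titled "Index of /path"; matching on that title is only a heuristic, and the function names are made up for illustration.

```python
import re
from typing import List

# Apache's auto-generated listings carry a title like "Index of /docs".
# Matching on that title is a heuristic assumed here, not an official check.
_LISTING_TITLE = re.compile(r"<title>\s*Index of\s+/", re.IGNORECASE)


def looks_like_directory_listing(html: str) -> bool:
    """Return True if the HTML resembles an auto-generated directory index."""
    return bool(_LISTING_TITLE.search(html))


def linked_files(html: str) -> List[str]:
    """Collect href targets, skipping column-sort links and absolute/parent paths."""
    hrefs = re.findall(r'href="([^"]+)"', html, flags=re.IGNORECASE)
    return [h for h in hrefs if not h.startswith(("?", "/", "../"))]
```

Feed it the body of whatever your server returns for a folder URL; if the first function says yes, everything the second one returns is one crawl away from a search index.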


I just had Google index my Ajax directory. I have a directory where I keep Ajax files, and the only links to them are in my JavaScript Ajax calls.

I was pretty surprised that Google goes through your JavaScript, harvesting your Ajax links.

Well, you linked to them via JavaScript. The whole rest of the Internet might not have been that careful, though.
Pardon my SEO: Google uses both heuristics and partial execution of JavaScript these days. Linking to things only through JS is not a good way to keep Googlebot from stumbling upon them. I only mention this because a lot of people I know apparently think that Google's colony of well-paid supergeniuses has not written anything since like 2004.
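
To show how little it takes to harvest links from script source, here is a rough sketch of the heuristic side only (no JavaScript execution). This is not Google's method, which is not public; the regex, the extension list, and the function name are all assumptions made up for illustration.

```python
import re
from typing import List

# Hypothetical heuristic: treat any quoted string that starts with "/" or
# "http(s)://", or that ends in a common web extension, as a crawl candidate.
_URL_LIKE = re.compile(
    r"""["']((?:https?://|/)[^"'\s]+|[\w./-]+\.(?:php|html?|json|xml))["']""",
    re.IGNORECASE,
)


def candidate_urls(js_source: str) -> List[str]:
    """Return URL-looking string literals from JS source, in first-seen order."""
    seen: List[str] = []
    for match in _URL_LIKE.finditer(js_source):
        url = match.group(1)
        if url not in seen:
            seen.append(url)
    return seen
```

Run it over a script containing something like `x.open("GET", "/ajax/get_items.php?id=3", true)` and the endpoint falls right out, which is why hiding a URL inside JS hides it from nobody.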
No, this is a brand new site, on its first index through Google. I'm very confident that they went through the JS. Nobody else had any link to the site at all yet. Not really on topic here, but the parent's comment inspired me to share.