by boracay 4181 days ago
Hm. I think the most important lesson here is that we need to treat "communication data"[0] more like we treat financial or medical data. If there isn't one already, there should be a rule in security that says that anything that's actively being used can't also be secure. They had years and years of data just lying around that people had mentally filed under "communication". It's kind of like web security where you lock down all your servers and then some developer leaks all the credentials on pastebin.

[0] There's probably a better word for this. I basically mean volatile data, i.e. e-mail, working documents, logs, etc.

2 comments

The industry term for this is unstructured data, basically anything that isn't kept in a database.

There's more you have to consider as well: you don't actually want to just archive anything older than a year. You want to set a rule that says: "archive anything created more than a year ago that hasn't been accessed in the last 3 months".
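As a rough illustration of that rule (a minimal sketch, not how any particular product does it), the check could look like the following. The function name `should_archive` and its parameters are made up for this example, and "a year" and "3 months" are approximated as 365 and 90 days.

```python
from datetime import datetime, timedelta


def should_archive(created, last_accessed, now,
                   min_age=timedelta(days=365),
                   idle=timedelta(days=90)):
    """Return True when an item is both old enough and idle long enough.

    Hypothetical helper: min_age and idle are illustrative defaults
    for "more than a year ago" and "the last 3 months".
    """
    old_enough = (now - created) > min_age
    idle_enough = (now - last_accessed) > idle
    return old_enough and idle_enough


if __name__ == "__main__":
    now = datetime(2015, 1, 1)
    # Two years old, untouched for seven months: a candidate for archiving.
    print(should_archive(datetime(2013, 1, 1), datetime(2014, 6, 1), now))
    # Two years old, but read last month: stays where it is.
    print(should_archive(datetime(2013, 1, 1), datetime(2014, 12, 1), now))
```

The point of the second condition is that old-but-still-referenced items stay put, which is what takes most of the sting out of archiving.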

Further, there are all sorts of documents, like the ones mentioned in the article, that should be continuously monitored for and quarantined: "passwords.txt", or Word docs with Social Security or credit card numbers in them.
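To make that concrete, here is a toy sketch of such a scan. It assumes the document text has already been extracted to a plain string (parsing Word files is left out), and the names `findings`, `luhn_ok` and `SUSPICIOUS_NAMES` are invented for the example. The patterns are deliberately simple: a US-style SSN format and a 13 to 19 digit run that passes the Luhn checksum. Real scanners are far more thorough.

```python
import re

# Illustrative patterns only; real-world detection needs many more rules.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")
SUSPICIOUS_NAMES = {"passwords.txt"}  # hypothetical watch list


def luhn_ok(digits):
    """Luhn checksum: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def findings(filename, text):
    """Return a list of reasons this file might need quarantining."""
    found = []
    if filename.lower() in SUSPICIOUS_NAMES:
        found.append("suspicious filename")
    if SSN_RE.search(text):
        found.append("possible SSN")
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_ok(digits):
            found.append("possible card number")
            break
    return found
```

The Luhn check is there to cut false positives: most random 16-digit strings (order numbers, tracking IDs) fail it, while real card numbers pass.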

Then you can get really sophisticated and start doing heuristic analysis of user behavior, setting alerts when Jim in accounting's account starts accessing marketing plans or when the account activity spikes beyond 5x what their regular usage is tracked at.
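A very simplified sketch of those two alerts, just to show the shape of the idea. Everything here is an assumption for illustration: the function names, the use of a plain average as the "regular usage" baseline, and modelling "Jim's usual data" as a set of folder names. Actual behavioral analysis is a lot more involved than this.

```python
from statistics import mean


def spike_alert(history, today, factor=5.0):
    """True if today's access count exceeds factor x the historical average.

    history is a list of past daily access counts (hypothetical input).
    With no history there is no baseline, so no alert is raised.
    """
    if not history:
        return False
    return today > factor * mean(history)


def off_pattern_alert(usual_folders, accessed_folder):
    """True if the account touches a folder outside its usual set."""
    return accessed_folder not in usual_folders


if __name__ == "__main__":
    # Baseline of about 10 accesses a day, then 60 in one day.
    print(spike_alert([10, 12, 8], 60))
    # An accounting user opening marketing plans.
    print(off_pattern_alert({"accounting", "payroll"}, "marketing"))
```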

Full disclosure: the company I work for - http://www.varonis.com - makes software that does all of these kinds of tasks.

Do you see any trend where companies want their data more structured from the beginning?
Easier said than done. I routinely pull up emails from a year and a half ago for reference, and it'd be a giant pain if I had to request access to some sort of secure archive for them.

Maybe it's necessary to move in that direction (and maybe emails stick around only if you've specifically flagged them?), but you're going to have to drag people kicking and screaming into that kind of system. Gmail search has spoiled us.

My proposition: Anything worth referencing later is worth filing properly.

That may mean transcribing instructions into a stand-alone checklist, writing up formal user stories, or the like. However, those acts also clear away a lot of cruft that can otherwise make it nigh-impossible to find the needed info. I've run many a Gmail search, only to find that a valuable email was buried under innumerable "Not quite what I wanted" ones.

For a lot of internal data it's about having and using the appropriate systems in the first place. It's like VCS. You don't e-mail someone code anymore. You make a branch or whatever and then reference that in your communication.
You do need good alternative systems, of course. But once those old e-mails aren't around anymore, you would have to use the other system to still have access to the data.

I'm not even sure current e-mail systems are such a good tool. I would think chat for internal things and some CRM-type system (leveraging e-mail) would be better. But yes, as you said, easier said than done.

What you describe is not "easier said than done" - automated email archiving is easily done.

What you describe is "I want less company security, more personal convenience". (As almost everyone wants, almost all the time).

It's not easier said than done because it's technically difficult; it's easier said than done because you have to shove it down users' throats and they're not going to be happy about it. And because there's an associated cost to productivity.

Worth it to avoid something like Sony's incident? Probably. Easy to convince the non-technical decision-makers of that? I'd bet not.