by BrandonMTurner 5266 days ago
You really shouldn't be puzzled by this. It is actually pretty easy to explain if you have ever worked on a large-scale website before. The problem is that you begin to accrue large amounts of data and metadata (data about your data). And just "getting rid" of that data is actually hard at scale for a few reasons:

1) Let's say I post on your wall and then I delete my account. Does that mean the message should be removed from your wall? What if you really liked the conversation in the comments that took place after I posted it? Are you just out of luck? These problems get trickier and trickier to handle as things like groups, forums, and tagging get added to the social network feature set. All of a sudden it is very confusing and unclear what exactly should happen with this type of data. And if you do keep that message, then how much of the deleted account has to be kept alive to maintain your database relations? (This assumes you are using a normalized relational database to manage your site.)

2) The site I work on gets a lot of data from users. It isn't uncommon to have ~5MB from a single user in our database. The actual delete operation on those tables is really rough. If 4 users all tried to delete their accounts at the same time, doing a straight DELETE on the tables would be horrible. Not to mention it leaves holes in your tables in some cases.

3) Is it actually legal to delete the data? Can Apple just delete an account where charges can be placed? I would think they need to keep a history of who used what credit card and so on. I am guessing that medical records and emails for large companies have some kind of restrictions about data retention.

4) The backup issue. If a user deletes their account, does the user expect the company to also go through all its backups and delete their information from there as well?
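
To make point 2 a bit more concrete, here is a rough sketch of one common way to soften a big delete: remove the rows in small batches, committing after each one, instead of issuing a single straight DELETE. This is not what our site does; the `user_data` table, its `user_id` column, and the batch size are all made up for illustration, and it uses SQLite only so the example is self-contained.

```python
import sqlite3


def delete_user_rows_in_batches(conn, user_id, batch_size=1000):
    """Delete one user's rows a batch at a time; return the total deleted.

    Assumes a hypothetical table `user_data` with a `user_id` column.
    """
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM user_data WHERE rowid IN ("
            "  SELECT rowid FROM user_data WHERE user_id = ? LIMIT ?)",
            (user_id, batch_size),
        )
        # Commit per batch so each transaction stays short and
        # other work can run between batches.
        conn.commit()
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total
```

It trades one long, heavy operation for many short ones. It does nothing about the holes left in the tables, and nothing about points 1, 3, and 4.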

All of these things add up to a pretty big burden pretty quick and I think it is logical to see why companies might choose to not allow people to delete their own data. I can also understand why people disagree with that decision, but it really shouldn't be puzzling.

2 comments

It should be just as puzzling as any other security / privacy issue. No more, no less.

If a website doesn't take security and privacy as high priority concerns, then they don't want me as a user. I hope to educate more people to feel the same as I do.

They could at least disable logins for it and block any further emails from going out.
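
A minimal sketch of what that could look like, assuming a hypothetical `users` table with `login_enabled` and `email_enabled` flag columns (all of these names are invented, not any real site's schema). The account row itself stays in place, so anything that references it, such as wall posts, still resolves.

```python
import sqlite3


def deactivate_account(conn, user_id):
    """Turn off logins and outgoing email without deleting the row."""
    conn.execute(
        "UPDATE users SET login_enabled = 0, email_enabled = 0 WHERE id = ?",
        (user_id,),
    )
    conn.commit()


def _flag(conn, user_id, column):
    # `column` is only ever one of the two fixed names used below.
    row = conn.execute(
        f"SELECT {column} FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return bool(row and row[0])


def can_log_in(conn, user_id):
    """Check to run at login time; unknown users are refused."""
    return _flag(conn, user_id, "login_enabled")


def can_send_email(conn, user_id):
    """Check to run before queueing any email to the user."""
    return _flag(conn, user_id, "email_enabled")
```

This only works if the login path and the mailer actually check the flags, and it is deactivation, not deletion: the data is all still there.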