by sgrenfro 4731 days ago
I work on this at Facebook and we do permanently delete your content when you delete your account. It's an interesting distributed systems problem, and we're happy with the framework we've developed for this. We're working on a blog post with more details and hope to publish that soon.

Also, I mentioned why account deletion is a non-trivial problem in this comment thread last week: https://news.ycombinator.com/item?id=5976947.

2 comments

Just to vouch, though not in Facebook's case, it is complicated.

Many sites use a CDN (Akamai, Limelight, Amazon's CloudFront, etc.). The whole idea of a CDN is that it distributes content. Even if the origin (your copy) goes away, the CDN may continue serving the content for a long time. If someone has a specific item URL within that network, they can still access it. Working with CDN APIs to delete content (especially if, say, that content exists in several variants for different sizes, previews, etc.) can be ... interesting.
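To make the "various instances" point concrete, here's a rough sketch of what a purge has to enumerate. Everything in it is made up for illustration: the URL layout, the variant names, and the purge_one callback, which stands in for whatever purge call a given CDN actually offers. It isn't any real CDN's API.

```python
from itertools import product

# Hypothetical variant naming; real sites and CDNs will differ.
SIZES = ("orig", "large", "thumb")
KINDS = ("image", "preview")


def variant_urls(cdn_host, item_id, sizes=SIZES, kinds=KINDS):
    """List every URL under which one item might be cached."""
    return [
        f"https://{cdn_host}/{kind}/{size}/{item_id}"
        for kind, size in product(kinds, sizes)
    ]


def purge_item(cdn_host, item_id, purge_one):
    """Call purge_one(url) for each variant; return the URLs whose purge failed.

    purge_one is a placeholder for a CDN-specific purge call and is
    expected to return True on success.
    """
    failed = []
    for url in variant_urls(cdn_host, item_id):
        if not purge_one(url):
            failed.append(url)
    return failed
```

Miss one variant in that list, or ignore the failures, and that copy stays reachable for anyone holding its URL.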

And if third parties are presenting your content, they might also persist it, say as a Google preview or cache, or Archive.org, or other tools.

Even within your own systems, data can be replicated in ways which are difficult to access fully. Backups can exist which cannot be easily accessed for wiping. There are war stories of magically re-appearing data resulting from data recovery operations.

So, while it's possible to flag content as "don't present" pretty easily, actually rooting all of it out thoroughly can be a much more involved task.
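A toy sketch of the difference, and of how restored backups bring data back. The Store class and its method names are invented for this example, and I'm assuming the deletion log is kept outside the backups. This is not a description of how Facebook or anyone else does it.

```python
class Store:
    """Toy in-memory store contrasting a "don't present" flag with real deletion."""

    def __init__(self):
        self.rows = {}             # item_id -> {"body": str, "hidden": bool}
        self.deletion_log = set()  # ids that must never reappear (kept outside backups)

    def put(self, item_id, body):
        self.rows[item_id] = {"body": body, "hidden": False}

    def soft_delete(self, item_id):
        # The easy part: hide the item; the data is still there.
        self.rows[item_id]["hidden"] = True

    def hard_delete(self, item_id):
        # Record the deletion first so it can be replayed after a restore.
        self.deletion_log.add(item_id)
        self.rows.pop(item_id, None)

    def visible(self):
        return sorted(i for i, r in self.rows.items() if not r["hidden"])

    def restore(self, backup_rows):
        # Load an old snapshot, then replay every recorded deletion.
        self.rows = {i: dict(r) for i, r in backup_rows.items()}
        for item_id in self.deletion_log:
            self.rows.pop(item_id, None)


store = Store()
store.put("a", "photo")
store.put("b", "post")
backup = {i: dict(r) for i, r in store.rows.items()}  # snapshot taken before any deletes

store.soft_delete("a")
store.hard_delete("b")
store.restore(backup)
print(store.visible())
```

After the restore, "b" stays gone because its deletion is replayed from the log, but "a" is visible again: its flag only existed in the live data, and the snapshot predates it.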

Un-seeing is difficult.

Facebook uses CDNs; how is it not complicated in the way you describe for them too?
I phrased that poorly: I'm vouching for the general case, not for Facebook specifically. I've not worked for them or on their systems.
You really expect us to believe this, that data would be permanently deleted, coming from Facebook?

You know, THIS Facebook: http://pleasedeletefacebook.com (list of most of their history up through 2012)