by polack 3188 days ago
What do you mean by "auditable way of wiping data"? Just that there will be a log that the data was wiped, but the actual data is gone forever?

The reason I ask is that all "Big Four" auditors have been telling my company that we need to be able to wipe customer data, but at the same time there are other laws saying we must keep a record of all (financial) data for many years. None of them can say which law will override the other, though, since the two are not compatible...

4 comments

You'll need to be able to delete certain customer data in response to a valid request. To do so, you need to be able to find and review all such data, not just in databases, but also in unstructured and semi-structured forms such as file shares, SharePoint and email, and even paper files if they're in a filing system.

You also won't be able to keep backups of this data longer than is necessary for operational restore purposes (more on that below).

The rule is that you shouldn’t keep personal data for longer than is necessary for the purpose for which it was collected.

There are five exceptions to this, one of which is:

2) for compliance with a legal obligation which requires processing by Union or Member State law to which the controller is subject or for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.

This addresses the need to meet other regulatory requirements that you mentioned.

You'll need to keep a metadata record of what you have deleted.

In the event that you have to restore data from a backup for operational purposes, you need to cross reference it to the record of deletions that occurred since the backup was created to ensure that any such data is either not restored, or is immediately deleted again.
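To make the restore step concrete, here is a minimal sketch of cross-referencing restored rows against a deletion record. It assumes the record only holds an opaque internal `subject_id` and a timestamp; the names `ErasureRecord`, `subjects_to_reerase` and `filter_restored_rows` are hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ErasureRecord:
    """One entry in the metadata record of deletions (hypothetical shape)."""
    subject_id: str      # opaque internal identifier, assumed not to be personal data itself
    erased_at: datetime  # when the erasure was carried out in production


def subjects_to_reerase(ledger, backup_taken_at):
    """Subjects erased at or after the backup was taken, so the backup may still hold them."""
    return {rec.subject_id for rec in ledger if rec.erased_at >= backup_taken_at}


def filter_restored_rows(rows, ledger, backup_taken_at):
    """Drop restored rows belonging to subjects erased since the backup was created."""
    blocked = subjects_to_reerase(ledger, backup_taken_at)
    return [row for row in rows if row["subject_id"] not in blocked]


if __name__ == "__main__":
    backup_time = datetime(2018, 5, 1, tzinfo=timezone.utc)
    ledger = [ErasureRecord("cust-2", datetime(2018, 5, 10, tzinfo=timezone.utc))]
    restored = [
        {"subject_id": "cust-2", "name": "B"},
        {"subject_id": "cust-3", "name": "C"},
    ]
    print(filter_restored_rows(restored, ledger, backup_time))
```

In practice the same set could instead drive an immediate delete pass after the restore completes; which of the two you do depends on whether your restore tooling lets you filter rows on the way in.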

This is only a fraction of an organization's obligations under GDPR, being those most directly relevant to your question.

Disclosure: I work for a company that provides solutions in this space.

Pretty much exactly this and clearly your knowledge is greater than mine.

I'm still finding everywhere we store data and fixing as much security stuff as fast as I can (some of it I'm not sure programmers on here would believe).

It's a gargantuan task.

Which company do you work for, if you don't mind me asking? (If you do, no worries :) )

Can you explain in more detail how the GDPR applies to unstructured forms? Would those be forms specifically for inputting personal data, or any free text at all?

Any personal data is subject, whether it is contained in Word documents, PowerPoints (which could be image-based scans that will need to be OCR'd to make them discoverable), spreadsheets, text files, database dumps, PST files, CSV files, etc.

If it contains personal data on an EU natural person, regardless of where the company is based, or on any natural person anywhere if the company is EU based, it is subject to the GDPR.

My question is more, what if you don't know it has personal data? Say you're just a generic document storage & sharing service, and someone uploads a generic PDF or Word document which happens to contain personal data. Surely you're not expected to treat any possible data you receive as personal, just in case?

If you're providing a consumer storage service, and users are uploading their own data for personal use, this is outside the remit of GDPR.

If you're providing a storage service to a business that handles personal data, you're a data processor, not a data controller.

If you're the data controller, you need a classification technology that can identify personal data in those documents (amongst other capabilities).

As always, there are exceptions, but that's the general rule.
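As a rough illustration of what "identify personal data in those documents" means at the very simplest level, here is a toy sketch that scans already-extracted text for email addresses and phone-like numbers. The function name `find_personal_data` and both patterns are assumptions made for the example; a real classification product would need far more than two regular expressions (names, addresses, national IDs, OCR, file-format parsing and so on).

```python
import re

# Deliberately simple, illustrative patterns; they will miss cases and produce false positives.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def find_personal_data(text):
    """Return a dict mapping a category name to the matches found in text."""
    hits = {}
    emails = EMAIL_RE.findall(text)
    if emails:
        hits["email"] = emails
    phones = PHONE_RE.findall(text)
    if phones:
        hits["phone"] = phones
    return hits


if __name__ == "__main__":
    sample = "Contact jane@example.com or +44 20 7946 0958 about the invoice."
    print(find_personal_data(sample))
```

A document with any hits would then be flagged for review rather than treated as definitively containing personal data.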

Thanks!

Our understanding (I work at an agency) is that you must keep the data that is part of a contractual agreement. Purchase histories are a typical example of data you may not wipe.

You have to distinguish between what lawyers call lex specialis and lex generalis. The former is usually national accounting laws, archiving laws, banking laws, etc. that apply to certain organisations, industries, etc. These laws take precedence over lex generalis, which are general laws that apply where there are no special laws. GDPR, the latter, is a general law...

In our case it will likely mean that we have a defined, documented procedure in place to remove the customer's data within the specified period.

In terms of technical implementation it'll be a bastard (or result in us holding backups for a shorter period). Dumping your DB backups will mean that you still have the data outside of the period (for a lot of places).

It's going to be interesting.
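For the "holding backups for a shorter period" option, a minimal sketch of picking out backups that have outlived a retention window might look like this. The `expired_backups` helper, the dump names and the 30-day figure are all assumptions for illustration; the right period is a legal and operational decision, not something the code can tell you.

```python
from datetime import datetime, timedelta, timezone


def expired_backups(backups, retention, now):
    """Return the names of backups created before now - retention, sorted by name.

    backups maps a backup name to its creation time (timezone-aware datetimes).
    """
    cutoff = now - retention
    return sorted(name for name, created_at in backups.items() if created_at < cutoff)


if __name__ == "__main__":
    now = datetime(2018, 6, 1, tzinfo=timezone.utc)
    backups = {
        "db-2018-04-01.dump": datetime(2018, 4, 1, tzinfo=timezone.utc),
        "db-2018-05-20.dump": datetime(2018, 5, 20, tzinfo=timezone.utc),
    }
    # Hypothetical 30-day operational restore window.
    print(expired_backups(backups, timedelta(days=30), now))
```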

It's not just that you can no longer hold backups for an extended period as a form of pseudo archive. For those backups you do keep for operational restore purposes, you also have to ensure that data that was deleted or redacted under the GDPR right to erasure is not subsequently restored during a routine recovery, or is immediately deleted / redacted after the data set is recovered.

This (slightly ironically) will require keeping a record of what data has been deleted from production systems in response to "right to erasure" requests.