Hacker News new | ask | show | jobs
by Archelaos 1880 days ago
Using real person-related data for testing was made illegal in Europe by GDPR. Even with explicit consent of each affected person, the tester needs to make sure that the test really requires real person-related data, otherwise it would violate the principle of "privacy by design" from GDPR. Without consent it is entirely forbidden, as of my understanding of the GDPR.

Due to GDPR I had to implement an anonymization feature for the production database anyway. So I am able to clone the production database for testing and run the anonymization function on every entry in the test database before using it.

1 comments

This sounds like a needlessly strict interpretation of GDPR. Taken from the UK regulator's site:

> The lawful bases for processing are set out in Article 6 of the UK GDPR. At least one of these must apply whenever you process personal data:

> (a) Consent: the individual has given clear consent for you to process their personal data for a specific purpose.

> (b) Contract: the processing is necessary for a contract you have with the individual, or because they have asked you to take specific steps before entering into a contract.

> (f) Legitimate interests: the processing is necessary for your legitimate interests or the legitimate interests of a third party, unless there is a good reason to protect the individual’s personal data which overrides those legitimate interests. (This cannot apply if you are a public authority processing data to perform your official tasks.)

...

> Legitimate interests is the most flexible lawful basis for processing, but you cannot assume it will always be the most appropriate. It is likely to be most appropriate where you use people’s data in ways they would reasonably expect and which have a minimal privacy impact, or where there is a compelling justification for the processing.

> The processing must be necessary. If you can reasonably achieve the same result in another less intrusive way, legitimate interests will not apply.

> You must include details of your legitimate interests in your privacy information.

I included the legitimate interests bits because they seem most relevant to testing, but even if testing is not considered "necessary" in a particular use case, there still remain at least two more criteria that might satisfy the use of live data in testing, including explicit user consent. Much of the focus of GDPR is on privacy-invasive intrusive processing and prevention of harm, I think a lot of fuss around it can be dispelled when viewed from this angle.

I admit that using real person-related data in a test database for ordinary software development (not considering fixing a production system here) might be legitimate according GDPR based on contract and consent. But the hurds are so high that I cannot think of any realistic use case that involves more than a couple of people at the maximum. Contract and consent are subject to specific requirements here. The constent must typically be voluntarily, i.e. without any pressure or coercion, it must not be coupled to another condition and it must be given for each specific purpose indivdually. And I doubt that it is permitted to completely exlude the principle of "privacy by design" in a contract. And even if this were the case, you need this consent from each customer whose data is in the database, while you must not make your service to the customer dependant on his consent.

As to legitimate interest, Recital 47 of GDPR states: "The legitimate interests of a controller, including those of a controller to which the personal data may be disclosed, or of a third party, may provide a legal basis for processing, provided that the interests or the fundamental rights and freedoms of the data subject are not overriding, taking into consideration the reasonable expectations of data subjects based on their relationship with the controller. Such legitimate interest could exist for example where there is a relevant and appropriate relationship between the data subject and the controller in situations such as where the data subject is a client or in the service of the controller. At any rate the existence of a legitimate interest would need careful assessment including whether a data subject can reasonably expect at the time and in the context of the collection of the personal data that processing for that purpose may take place. The interests and fundamental rights of the data subject could in particular override the interest of the data controller where personal data are processed in circumstances where data subjects do not reasonably expect further processing."[1]

Typically one does not have a contract about software development with a customer whose personal data is stored in a database, but about a specific service, such as for example selling something to him via a Web-shop. So if Uncle Joe is buying something from the Web-shop, can he reasonably expect that his personal data is used in developing the Web-shop software? Most likely not. Ergo, there is no legitimate interest to use his data for that pupose.

[1] https://gdpr-info.eu/recitals/no-47/

[Edit for clarity.]