


	by EamonnMR 2777 days ago
	I wonder how GDPR will apply to models trained on user data. What does deleting user data look like in that case?

3 comments

abainbridge 2777 days ago

I work at Microsoft Research in the UK. A few weeks ago we had a lecture from a lawyer on exactly this subject. Her main point was that GDPR gives people the right to request their data be deleted but it gives companies the right to refuse if it would cause unreasonable damage to their business. Until a case makes its way through all levels of the court system, nobody knows how this collision of rights will be interpreted.

I suspect someone would have to show that the model trained on their data revealed something about them in a practically harmful way.


eiaoa 2777 days ago

> A few weeks ago we had a lecture from a lawyer on exactly this subject. Her main point was that GDPR gives people the right to request their data be deleted but it gives companies the right to refuse if it would cause unreasonable damage to their business.

I guess it still needs to be litigated, but the question on my mind is: does that right of refusal apply only to the model, or also to the data that trained it? If it applies to the data, the regulation is pretty useless, since anyone could avoid the deletion requirements by training models on it. If it doesn't, I think the use in the model takes care of itself: at some point they'll need to retrain, and then your data won't be there.


mattlondon 2777 days ago

It is an interesting thought.

I feel, though, that the point of the GDPR was to protect our personal data held by companies, not to prevent companies using our personal data to make money.

So if a company uses your personal data to train a model (let's assume you willingly gave your informed consent for the time being), and then they delete your data after they have trained their model, does that model contain your personally identifiable information? I'd argue that it does not - the model is just some weights, right? So 0.6, 34.291, 0.0016 - is that you, mum?

.... but having just said that, I do wonder what happens if you run the model in reverse, like the deepdream stuff did (1). Could it re-generate PII (or rather generate "nearly-PII") purely from those weights?

1 - https://en.wikipedia.org/wiki/DeepDream
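
To make the "run the model in reverse" idea concrete, here is a minimal sketch, assuming a toy logistic model with made-up weights (it is not DeepDream itself and not any real system). It starts from a blank input and nudges it by gradient ascent until the model is confident, which is roughly the trick DeepDream applies to images. The names `invert`, `weights` and `bias` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def invert(weights, bias, steps=200, lr=0.1):
    """Gradient ascent on the input to maximise the model's confidence."""
    x = [0.0] * len(weights)  # start from a blank input
    for _ in range(steps):
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        # d/dx log(sigmoid(z)) = (1 - sigmoid(z)) * w
        scale = 1.0 - sigmoid(z)
        x = [xi + lr * scale * w for xi, w in zip(x, weights)]
    return x

# Hypothetical weights, for illustration only
weights = [0.6, -1.2, 0.3]
bias = 0.0

x = invert(weights, bias)
confidence = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
print(x, confidence)
```

For a linear model like this, the recovered input is just a scaled copy of the weights, so it shows what the model responds to in aggregate rather than any one person's record. Whether a larger model could give back something closer to an individual's data is the open question.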


toomuchtodo 2777 days ago

I would assume you'd have to delete the model if you can't unwind the specific user data used to evolve it.


tjoff 2777 days ago

You only have to delete data that is personally identifiable.


toomuchtodo 2777 days ago

Can you prove it isn't personally identifiable?


ralmeida 2777 days ago

Can you prove it is? Many, many things can be personally identifiable given enough resources and associated data, so it's unclear where the burden of proof lies, especially considering the sibling comment mentioning a GDPR exception to deleting data if it would cause sizable damage to a business.
