Hacker News
by Silhouette 1616 days ago
I have a small business that operates a website. To be absolutely clear, we don't rent or sell personal data in any form and our business model has nothing to do with tracking or profiling anyone.

We retain, indefinitely, certain access records that can potentially be used to identify individuals. These records have demonstrably helped us to defend against attacks on our infrastructure and to prevent attempted fraud on multiple occasions, sometimes years after the records used were first collected. Our privacy policy states these general purposes for processing but does not disclose exactly how we use the records for those purposes.

So, are we compliant because there is a demonstrable legitimate interest in keeping these records? Is holding that personal data indefinitely, knowing that it mostly won't be needed, disproportionate and a GDPR violation? I'd love the people who think the GDPR is simple to show me verifiable, authoritative answers to these types of questions, because so far we haven't found any lawyer who can, nor found any information from any relevant regulator that we could point to as a clear indication either way.

4 comments

1. You'll have to disclose this in your privacy policy

2. You can store identifying data of website accesses etc for at most 30 days without worry

3. Beyond that, you can only store data that's absolutely necessary, e.g. metadata associated with actual purchases and transactions, but not every access.

4. Usually, you'll have to delete that 2 years afterwards, in some exceptional situations up to 30 years are possible

What I'd do: 1) disclose, 2) delete logs after 29 days, 3) copy all logs associated with a customer's transaction into a separate storage location, sharded by customer, transaction and date, so you can delete it 2 years later.
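
A minimal sketch of the two-tier retention idea described above, in Python. The 29-day and 2-year limits are just the figures from this comment, not values taken from the GDPR or any regulator, and `LogRecord`, `is_expired` and `purge` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Assumed limits, copied from the comment above; not legal guidance.
ACCESS_LOG_RETENTION = timedelta(days=29)
TRANSACTION_LOG_RETENTION = timedelta(days=2 * 365)


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical log entry; transaction_id is set only for purchase-related logs."""
    timestamp: datetime
    customer_id: Optional[str] = None
    transaction_id: Optional[str] = None


def is_expired(record: LogRecord, now: datetime) -> bool:
    """Transaction-linked records get the longer limit, plain access logs the shorter one."""
    limit = TRANSACTION_LOG_RETENTION if record.transaction_id else ACCESS_LOG_RETENTION
    return now - record.timestamp > limit


def purge(records: List[LogRecord], now: datetime) -> List[LogRecord]:
    """Return only the records that are still within their retention window."""
    return [r for r in records if not is_expired(r, now)]
```

Run from a daily job, `purge` would drop plain access logs after 29 days while keeping transaction-linked ones until the longer limit passes.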

My response to all of your points is the same: can you cite the authority for those claims please?

For example, no-one processing card payments is going to disclose in any privacy policy exactly how they combine all their signals to determine fraud risk and whether to allow an attempted transaction in real time.

If you're really in business in the EU or associated companies, and not just LARPing in a comment section, you'll have been under these laws for several years now, and should already have contacted your own lawyer on this question.

I've commented about my business interests and the GDPR several times in my more than a decade on HN. You're welcome to scan my comment history if you think I'm LARPing but I have no interest in continuing a discussion with anyone who isn't doing so in good faith.

I addressed your point about taking expert advice in my original comment above: neither lawyers nor regulators have been able to give us a clear answer so far.

I'd believe that you're describing your system's current requirements at a high level. But without exact technical details of how retaining personal information helps you prevent fraud many years later, I don't believe that it is the only way possible.

For example, if the personal information you're talking about is IP addresses, it seems like you could cook those down to non-identifying information pretty quickly - eg zap the last octet. Furthermore, I'd think you would want to cook it down promptly so you can store the current use of the IP block rather than what it might be used for in a few years. (Sidenote: I personally get hassled based on my IP address block way too much, so keep in mind you're harming legitimate customers if this is what you're doing).
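
To make "zap the last octet" concrete, here is a rough sketch using Python's standard `ipaddress` module. The function name `truncate_ip` and the default prefix lengths (/24 for IPv4, /48 for IPv6) are my own assumptions for illustration. Whether a truncated address actually stops being personal data is a legal question this code does not answer.

```python
import ipaddress


def truncate_ip(addr: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Reduce an address to its containing network, discarding the host bits."""
    ip = ipaddress.ip_address(addr)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get the enclosing network back.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network)


print(truncate_ip("203.0.113.77"))         # 203.0.113.0/24
print(truncate_ip("2001:db8:abcd:12::1"))  # 2001:db8:abcd::/48
```

Storing only the returned network string means the full address never reaches long-term storage.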

Another example - if you're keeping personal details on people who have committed fraud (or not) and referencing that years later, then I'd say that falls squarely in the purpose of the GDPR and you should not be doing that long term.

Or you're doing something else. But without describing exactly what you're doing, you don't make a very compelling case.

> Another example - if you're keeping personal details on people who have committed fraud (or not) and referencing that years later, then I'd say that falls squarely in the purpose of the GDPR and you should not be doing that long term.

You're saying that we shouldn't be keeping detailed records of previous attempts to criminally defraud us that are demonstrably useful for identifying and preventing further attempts to criminally defraud us over a long period of time by the same groups of people?

I'm sorry to be blunt but that is not a serious proposition. If anyone thinks the GDPR says otherwise, chapter and verse please.

I agree with you that this is a good example of a GDPR challenge. I think that building a profile of user patterns to protect against fraud & abuse is a perfectly legitimate business interest, even under the GDPR. But I disagree that this is a problem unique to the GDPR -- any means of profiling like this runs the risk of discriminating against protected groups or individuals, and there's been plenty of discussion here about e.g. the use of proxy identifiers for race or background employed by universities to filter applicants.

As with all legislation, there is no clear yes or no answer. If a GDPR watchdog were to evaluate your use of this information, they would primarily care whether you (as a company) are aware of the risks involved in the profiling, whether you have spent the effort to weigh the pros and cons of your approach, and whether you have taken steps to sufficiently anonymize such data without making it useless for your purpose. If you have, I don't think you have to worry about retroactive fines even if the watchdog concludes you're violating the GDPR in some way.

Personally, I'd go even further and say that you don't have to honour data deletion requests from users that have tried to defraud you -- it's unlikely they will do so because they would be required to identify themselves to you, after which you can turn them in to the police, but you can legitimately argue that you need to keep their identity on-file to protect your business. I'm sure the GDPR disagrees with me here, but I'd like to see a watchdog test that case in court.

> I'm sure the GDPR disagrees with me here, but I'd like to see a watchdog test that case in court.

I doubt it. The right to erasure has never been absolute even under GDPR. Typical examples are that you can't compel a bank to delete all records of a loan it gave you, nor compel the police to delete a criminal record of your past behaviour, as long as the data is lawfully and properly handled.

Is that what you are specifically doing and describing above, or are you just choosing one implication of my general pondering?

If this is actually what you're doing, let's discuss the specific details of the information flow you're using to make these decisions, rather than talking in terms of strong sweeping generalizations. If you're just picking out a worst case implication of my general statement, that doesn't seem very productive. An example of what I was specifically thinking:

Customer buys something from Vendor. Customer never receives package. Vendor refuses to issue refund because tracking marks delivered. Customer files CC chargeback (I know this is less common in the EU but work with me here).

From the perspective of the Vendor this Customer has defrauded them, or at the very least is an increased risk. From the perspective of the Customer, they've been unjustly judged for circumstances beyond their control.

Can the Vendor retain that judgement on the Customer forever? Can they share it with other Vendors to create an industry blacklist of "problematic" customers? These questions seem squarely within the aim of the GDPR.

In the example I was thinking of the situation was much more clear-cut than that. Again I'm not going to get into real specifics because this is legal stuff and it's just a discussion forum, but consider this broadly similar example.

You provide a service that anyone can sign up for. It costs money.

As a matter of good customer service your usual practice is to allow significant grace periods when money owed is overdue before you actually cut a customer off.

Someone signs up for a real account using the name "Mallory One" and then exploits the "generosity" of your system to avoid paying part of what they owe you. Eventually you cut them off.

Someone then signs up for a real account using the name "Mallory Two" and does the same thing again. Again you eventually cut them off but miss part of the payment you were due.

After this has happened several times over an extended period, it comes to your attention that the only people signing up using names of the form "Mallory (number)" are ripping you off and the person or persons responsible have already cost you thousands in unpaid bills.

You add a rule to your security system that says when anyone creates an account with the name "Mallory (number)" you will immediately block it.

How long are you allowed to remember the pattern "Mallory (number)" in your security system? It can potentially be tied to a specific individual and is therefore personal data, but you reasonably believe that person to be responsible for all of that fraud, and you reasonably expect that they will continue to defraud you if you don't prevent it.
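
For concreteness, a signup rule like the one in this simplified example might look roughly like the sketch below. It is only an illustration of the analogy: `NUMBER_WORDS`, `BLOCKED_NAME` and `should_block_signup` are hypothetical names, and the real pattern was not this simple.

```python
import re

# Hypothetical list of number words; digits are matched separately by \d+.
NUMBER_WORDS = ("one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten")

# Matches names of the form "Mallory <number>", ignoring case and stray whitespace.
BLOCKED_NAME = re.compile(
    r"^\s*mallory\s+(?:\d+|" + "|".join(NUMBER_WORDS) + r")\s*$",
    re.IGNORECASE,
)


def should_block_signup(name: str) -> bool:
    """Return True when the signup name fits the known abusive pattern."""
    return BLOCKED_NAME.match(name) is not None


print(should_block_signup("Mallory Three"))  # True
print(should_block_signup("Mallory Jones"))  # False
```

The thing being retained here is the rule itself, which was derived from the earlier accounts, rather than a copy of each account's details.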

Is this a practical way of preventing fraud? Can the person not switch their next account from "Mallory Three" to "Eve Smith", thereby evading your rule?

I understand you've simplified the example here for the sake of discussion, but I think the details inform the situation. Like if you really just want to discriminate on any account named "Mallory _____" then that doesn't seem like personal data to me (even though you've created the rule from "personal data"), but also it doesn't seem particularly effective so there must be more to the story.

For an analogous example, you don't need to keep a permanent record of fraudulent transactions with specific IP addresses of 10.0.37.{23,45,67} to remember that 10.0.37/24 is suspicious.
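
A quick sketch of what I mean, using Python's standard `ipaddress` module. The only thing stored long term is the network block; `SUSPICIOUS_NETWORKS` and `is_suspicious` are names I've made up for illustration.

```python
import ipaddress

# Only the aggregated block is kept, not the individual addresses that led to it.
SUSPICIOUS_NETWORKS = [ipaddress.ip_network("10.0.37.0/24")]


def is_suspicious(addr: str) -> bool:
    """Check whether an address falls inside any remembered suspicious block."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in SUSPICIOUS_NETWORKS)


print(is_suspicious("10.0.37.45"))  # True
print(is_suspicious("10.0.38.45"))  # False
```

Note that this also flags every other address in the block, which is the collateral-damage problem I mentioned earlier.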

(Also what about everybody else who legitimately has the first name Mallory?)

Your case is interesting because it contains a few unusual qualities that businesses generally don't offer, but smaller "nicer" businesses will give more leeway. You could straightforwardly stop giving a large freebie to new users or require a payment method or identification on signup, but it would be nice to figure out where the line is instead of just giving in to such less friendly practices.

I'm sorry that it's difficult to discuss these issues reasonably based only on simplified analogies. Again, these are real issues with potentially real police or courts involved if it got serious enough, so there is only so far it's sensible for me to go with any examples.

Yes, the situation absolutely was a practical way of preventing fraud. It saved us a significant amount of money with no apparent downside except for a little time to implement the security measures and the slight GDPR concern we've been discussing. The pattern we were looking for in that case wasn't quite as simple as the name example, but perhaps you'll take my word that it really was almost as obvious but it did also have personal data/identification implications. As I wrote in another comment, it's amazing how dumb people are sometimes but even dumb people can still cause damage. I have a few other examples in mind where similar principles apply and those have also prevented material damage to the business and/or other customers.

Just to explain one detail that might look implausible, the grace period being exploited wasn't for new customers, who do have to pay up front. It was for existing customers who pay late (or, as it turns out, sometimes not at all) when further payments are due. Ironically, part of the reason we allow that period, beyond wanting excellent customer service, is GDPR compliance. We have an obligation to protect any personal data we hold properly and there is at least a plausible argument that deleting everything the moment an account goes overdue on a bill would not meet the standard.

As you have perhaps guessed, this is a smaller business and we do try to be a "nicer" one. Most of the time I think that is a good thing. However it does mean we don't have dedicated staff or budget for any issues like this. When someone on the far side of the world is trying to rip us off, one of us doesn't get to sleep that night until we've fixed the problem, if we can. Every time we have to spend time and money on compliance changes or taking professional advice and every time the business loses money to fraud, that has rather direct consequences for the personal finances of the people who are doing the work to run the business. We do take security and privacy seriously and we try extremely hard to stay on the right side of any relevant rules (far more than most professionals we talk to expect for a business of our size, and I'm told far more than a lot of much larger businesses with dedicated staff for this stuff).

But it really does boil my blood when people say things like GDPR compliance is easy unless you're doing dodgy things or they assume that because I don't agree it means we haven't thought about it or run a business professionally. If the issues were so simple and obvious, there wouldn't be 16 comments under my original one as I write this without a single citation of either the GDPR or any regulatory or court authority to back up any of the answers given or claims made.

righty-ho. You're in the EU, UK or another country under the GDPR? Have you spoken to an actual lawyer about this? Since, as you say, it's your business.

And this has been a regulation you've been required by law to follow for quite a few years now. Have you just not been worrying about it?

You're asking questions that, as other commenters have noted, are plausibly a valid case, but are quite specific to the precise details of what you're doing and how you do it.

Yes, we took real legal advice in good time. We also had some time with a specialist in GDPR compliance and eventually spoke with the regulator in our country. While I'm obviously not going to discuss specifics here, nothing was hidden from any of those experts. And we are still not 100% clear on what is theoretically allowed here.

This is my point. Literally no-one actually knows whether these kinds of edge cases are permitted under the regulations until you're already at the point that someone in a regulator's office has initiated a formal action to find out and potentially penalise you if they're not.

Are there not any other ways to secure your system? Seems a bit off to me that some personal info is all that is needed to try fraud or 'attacks'.

You would be amazed how dumb (and yet still dangerous and disruptive) some people can be. Here is an example without getting too specific.

We once had a series of attacks by the same group. They would sign up for real accounts on our site and then take certain actions that violated our terms of service. Everyone here would agree those violations were serious enough to justify immediate termination and potentially reporting to relevant authorities.

Every time they signed up there were certain patterns in the details they gave that allowed us to recognise them. Those are the kinds of data we intend to keep indefinitely so that our security system can intercept any further attempts (which still happen sometimes) and block them.

IP addresses are the problem. Those are pretty important in trying to find bad actors. They are widely stored for a long time to be able to identify various forms of abuse. But the GDPR considers IP addresses to be personal data, because an address can potentially identify a unique individual.

For example, most classic forum software stores the IP address of a post submitter indefinitely for anti-abuse reasons. It seems like nobody running such forum software could ever be GDPR compliant. This is despite them never selling this data, trying to mine it for any nefarious purposes or anything like that.