by throwaway76543 2806 days ago
No, the data is not up for sale. As the article you linked clearly states the data is anonymized and cannot be tied to individuals.

It's simply not a thing. It would violate PCI-DSS.

4 comments

"As the article you linked clearly states the data is anonymized and cannot be tied to individuals."

You should never believe that for two reasons:

1. Companies often get away with lying about that.

2. De-anonymization techniques, especially when multiple data sources are available to cross-compare, are improving every year thanks to an active research community.

Most incentives work against your privacy. Most companies act on incentives. Best to just not share your data if you're concerned about where it might end up.

With regard to #2, I covered in my comment to the poster you're replying to that correlating partial names and the store location with registered voters from your state's freely available voter registry would de-anonymize most transactions in the dataset.

Never mind that Google has location data for Android users (and Google Maps data on Android), and better profiles of people than the voter database has. Their attempts at user de-anonymization will likely be even more accurate.

Short of reducing the transaction dataset to purchase amounts and dates, or simply respecting customers' privacy and not transferring this data, Mastercard is not anonymizing this data.

Google is using this data to link the online ads it displays with in-store purchases, thus Mastercard is giving them enough data to correlate who views an ad from Google's ad network with their in-store purchase. Partial names would suffice for this purpose. Combined with the store location for each transaction, one could de-anonymize most customers in a given dataset by correlating against freely available datasets like the state voter registry (which often provides full names, addresses, elections voted in & more).
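To make the linkage concrete, here is a minimal sketch of that kind of join. Everything in it is made up for illustration: the record layout, the "first initial plus last name" format for partial names, and the assumption that a store's ZIP code roughly matches the shopper's home ZIP are mine, not anything Mastercard or Google has described.

```python
from collections import defaultdict

# Hypothetical voter registry rows (all names and ZIP codes are invented).
voter_registry = [
    {"full_name": "Alice Johnson", "zip": "12345"},
    {"full_name": "Alan Jones", "zip": "12345"},
    {"full_name": "Alice Jensen", "zip": "67890"},
    {"full_name": "Bob Smith", "zip": "12345"},
]

# Hypothetical "anonymized" transactions: partial name plus store location.
transactions = [
    {"txn_id": 1, "partial_name": "A. Johnson", "store_zip": "12345"},
    {"txn_id": 2, "partial_name": "B. Smith", "store_zip": "12345"},
    {"txn_id": 3, "partial_name": "A. Jones", "store_zip": "67890"},
]


def partial_key(full_name):
    """Reduce a full name to the assumed partial form, e.g. 'A. Johnson'."""
    parts = full_name.split()
    return f"{parts[0][0]}. {parts[-1]}"


def build_index(registry):
    """Index registry entries by (partial name, ZIP)."""
    index = defaultdict(list)
    for voter in registry:
        key = (partial_key(voter["full_name"]), voter["zip"])
        index[key].append(voter["full_name"])
    return index


def link(txns, registry):
    """Map each transaction to a full name when exactly one voter matches."""
    index = build_index(registry)
    results = {}
    for txn in txns:
        candidates = index.get((txn["partial_name"], txn["store_zip"]), [])
        results[txn["txn_id"]] = candidates[0] if len(candidates) == 1 else None
    return results


linked = link(transactions, voter_registry)
print(linked)
```

In this toy data two of the three transactions resolve to exactly one voter. The third does not, because the only "A. Jones" lives in a different ZIP than the store, which is the kind of miss a richer location profile would close.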

Never mind that Google has location data for Android users (and Google Maps data on Android), and better profiles of people than the voter database has. Their attempts at user de-anonymization will likely be even more accurate.

I think a good way to describe it is that PCI-DSS just removes one potential data point out of multiple that are used in a triangulation effort. Loss of any one data point doesn't necessarily ruin the triangulation effort.
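A rough way to see this, as a sketch only: count how many records stay unique on their remaining fields once the card number is dropped. The field names and rows below are hypothetical and chosen just to illustrate the point.

```python
from collections import Counter

# Hypothetical records; every field name and value here is invented.
records = [
    {"zip": "12345", "birth_year": 1980, "gender": "F", "card_last4": "1111"},
    {"zip": "12345", "birth_year": 1980, "gender": "M", "card_last4": "2222"},
    {"zip": "12345", "birth_year": 1975, "gender": "F", "card_last4": "3333"},
    {"zip": "67890", "birth_year": 1980, "gender": "F", "card_last4": "4444"},
]


def unique_fraction(rows, fields):
    """Fraction of rows whose combination of `fields` appears exactly once."""
    combos = Counter(tuple(row[f] for f in fields) for row in rows)
    unique = sum(1 for row in rows if combos[tuple(row[f] for f in fields)] == 1)
    return unique / len(rows)


with_card = unique_fraction(records, ["zip", "birth_year", "gender", "card_last4"])
without_card = unique_fraction(records, ["zip", "birth_year", "gender"])
zip_only = unique_fraction(records, ["zip"])
print(with_card, without_card, zip_only)
```

Dropping the card field leaves every row in this toy set just as unique as before; only when the data is cut down to a single coarse field does the uniqueness fall off.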

Some of the data will also work for a cash system. And it's not just Google or other apps on their phones that people have to worry about now. (Not a popular opinion, but Google is generally a lot better than others at data protection because it doesn't often sell the data on; it works as an interface to the data and if it sold the data outright their other services would be less useful. So it has incentive to protect a lot of what it collects that other companies do not.)

Some surveillance systems recoup costs by selling off data about license plate and other sightings to companies like TransUnion, who will aggregate the results and sell them on [1].

The ability to narrow the focus is a bit lower, as it can't tie you to a specific store unless it has a dedicated parking lot (which is fairly rare and not easily predictable).

But if you look at what else is possible when all surveillance is up for sale, it's a bit scary how easy de-anonymization can potentially be.

Take the same data sources that are already actively sold and resold to and by TransUnion, and add Facebook/Google tech for identifying people instead of the numbers seen on vehicle plates. Anyone who walks by a camera (and cameras are ubiquitous in any marketplace), and whose identity has already been tied to photographs of them by one system, could have their location detected and sold on to whoever wants it.

This doesn't even need to be a voluntary release of the data like tagging photos on Facebook, or actively collected by other people volunteering it by tagging you.

As an example, take something innocent like a trip to the grocery store. Even if they are not selling their parking lot surveillance to TransUnion for license plate tracking, they could potentially be selling or aggregating other data for the same purpose. The camera feed they pointedly show you in the self-checkout line to discourage dishonesty isn't limited to their self-checkouts, and the footage of you in that location is a great training source for a recognition algorithm.

And right beside it is a register machine to tell the company your name, either from the credit card transaction or some other way. They might have your name on a reward/discount card that they insist you use to get discounts (or to put it another way, to avoid paying extra to opt-out, since these items are hard to identify as a promotion -- they are promoted the same as regular price, often have no competitor, and non-promotional pricing is obscured). Either of these can correlate the video/imagery to the transaction and a person's identity.

It wouldn't take a lot to build an image recognition profile: just repeat the process a few times and it will probably be good for years even if it isn't reinforced with new data. Collecting information used for stuff like this becomes dead simple without significant data protection laws.

The recent Forbes article is a great example of how current systems like TransUnion's TLO unfortunately have seriously low barriers to use that have enabled things like identity theft [2]. More sophisticated uses are equally possible, highly probable, and likely much less detectable, such as politicians tracking behavior of political rivals and their operatives, or actors working on behalf of foreign governments to track politicians and military figures.

[1] https://www.tlo.com/vehicle-sightings

[2] https://www.forbes.com/sites/thomasbrewster/2018/10/12/how-a...

- Google has Android data on who was where

- Google buys transaction data, only timestamps of when payments were made

- ???

- Profit
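The "???" step can be sketched as a join on place and time. This is a minimal illustration under my own assumptions: that each transaction carries a store identifier alongside its timestamp, that location history can be reduced to "device X was at store Y at time T", and that a five-minute window is a sensible tolerance. All identifiers and timestamps are made up.

```python
from datetime import datetime, timedelta

# Hypothetical location history: which device was seen at which store, and when.
location_pings = [
    {"device": "dev-a", "store": "store-1", "time": datetime(2018, 9, 1, 12, 0)},
    {"device": "dev-b", "store": "store-1", "time": datetime(2018, 9, 1, 12, 30)},
    {"device": "dev-a", "store": "store-2", "time": datetime(2018, 9, 2, 9, 15)},
]

# Hypothetical purchased transactions: no names, only store and payment time.
transactions = [
    {"txn_id": "t1", "store": "store-1", "time": datetime(2018, 9, 1, 12, 2)},
    {"txn_id": "t2", "store": "store-2", "time": datetime(2018, 9, 2, 9, 17)},
    {"txn_id": "t3", "store": "store-1", "time": datetime(2018, 9, 1, 12, 31)},
]


def match_devices(txns, pings, window=timedelta(minutes=5)):
    """For each transaction, collect devices seen at that store within `window`."""
    matches = {}
    for txn in txns:
        matches[txn["txn_id"]] = {
            ping["device"]
            for ping in pings
            if ping["store"] == txn["store"]
            and abs(ping["time"] - txn["time"]) <= window
        }
    return matches


matches = match_devices(transactions, location_pings)
print(matches)
```

Each transaction in this toy data lines up with exactly one device. In a busy store a single payment would match many devices, but repeating the join across several purchases tends to leave one device in common.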

If you're correct, then the even more correct version is "the data is not up for sale yet."