by next_xibalba 624 days ago
None of it is accurate and almost all of it is modeled from sparse, low quality training sets. Banks are not selling PII’ed account balance data to shady aggregators.

To me, the more interesting and outrageous story is how many aggregators are able to sell garbage data so successfully.


> Banks are not selling PII’ed account balance data to shady aggregators.

You know how some banks have a service which tells you how you spend your money? With graphs, 20% on power, 15% on food, etc?

That service is provided by a third party, who is given the data anonymized, with a unique ID number assigned to each customer. Yet it's trivial to deanonymize, and that's what happens.

All that is required is one purchase with a points card or an air miles card, and you are forever relinked to your data. It's how points cards make cash on the side, and how air miles programs do too. The exact time, date, amount, and location of a purchase make a great sync method.
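A minimal sketch of that kind of relinking, assuming made-up records and field names (`anon_id`, `member`, `store` are all hypothetical, not any real vendor's schema). It joins an "anonymized" feed to a loyalty-card log on the tuple of time, amount, and location:

```python
from collections import defaultdict

# Hypothetical "anonymized" bank feed: pseudonymous id plus purchase details.
bank_rows = [
    {"anon_id": "u-1001", "ts": "2023-05-01T12:31", "amount": 42.17, "store": "S-17"},
    {"anon_id": "u-1001", "ts": "2023-05-03T09:02", "amount": 8.50, "store": "S-04"},
    {"anon_id": "u-2002", "ts": "2023-05-01T12:31", "amount": 13.99, "store": "S-17"},
]

# Hypothetical loyalty-card log: the same purchase, keyed by real identity.
loyalty_rows = [
    {"member": "Alice Example", "ts": "2023-05-01T12:31", "amount": 42.17, "store": "S-17"},
]


def purchase_key(row):
    # Compare amounts in whole cents to avoid float equality problems.
    return (row["ts"], round(row["amount"] * 100), row["store"])


def relink(bank_rows, loyalty_rows):
    """Map anon_id -> member name wherever a purchase matches exactly one anon_id."""
    ids_by_key = defaultdict(set)
    for row in bank_rows:
        ids_by_key[purchase_key(row)].add(row["anon_id"])

    links = {}
    for row in loyalty_rows:
        candidates = ids_by_key.get(purchase_key(row), set())
        if len(candidates) == 1:  # unambiguous match only
            links[next(iter(candidates))] = row["member"]
    return links


links = relink(bank_rows, loyalty_rows)
# Once one purchase is linked, every row under that anon_id is exposed.
history = [r for r in bank_rows if r["anon_id"] in links]
print(links, len(history))
```

A single matching purchase is enough here: the second `u-1001` row never appears in the loyalty log, yet it is attributed to the same person through the shared id.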

If you pay for your phone with any form of traceable payment, they know who you are, your address, etc. From this, immense data is gleaned, such as lot value, neighborhood, and so on. Companies can even get your current location and geofence you, being alerted if you move in or out of a certain location.
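For anyone unfamiliar with geofencing, here is a rough sketch of the alerting logic, assuming a circular fence and a stream of location pings. The function names, ping format, and coordinates are invented for illustration and do not reflect any telco's actual API:

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geofence_events(pings, center, radius_m):
    """Return (timestamp, 'enter' | 'exit') each time the device crosses the fence."""
    events = []
    inside = None  # unknown until the first ping arrives
    for ts, lat, lon in pings:
        now_inside = haversine_m(lat, lon, center[0], center[1]) <= radius_m
        if inside is not None and now_inside != inside:
            events.append((ts, "enter" if now_inside else "exit"))
        inside = now_inside
    return events


# Hypothetical pings: far from the fence, then near its center, then far again.
pings = [("t1", 40.010, -75.0), ("t2", 40.001, -75.0), ("t3", 40.020, -75.0)]
events = geofence_events(pings, center=(40.0, -75.0), radius_m=500)
print(events)
```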

Mobile phone companies sell this data/service via an easy API. Companies relink a phone from the app level via IMEI and number, which is sold to aggregators along with phone data (contacts, etc). The telco API links to real identity.

Once linked, forever linked.

Most people love free apps, and give up messages/SMS, contacts, and more to save a dollar on an app. From this, immense relationship data is gleaned, including likely employer and social circle.

Even if you are careful with your app permissions, certainly many acquaintances of yours aren't, so you get linked to their social circle, often with contact name/address.

This is just the simple stuff.

Source: I've dealt with these companies.

>Banks are not selling PII’ed account balance data to shady aggregators.

But is Plaid?

And banks do sell account balance data. They also sell credit and debit transaction history.

> But is Plaid?

Or any of those budgeting apps that integrate with your bank account.

That's probably the signal. But as one of the parent posters said, the # of folks who use such budgeting apps is quite small. For advertising, small samples are useless, so this data has to be modeled to the full US population.

For that, this very biased training set is what gets used. And almost always the independent variables used for modeling are 7-10 standard demographics.
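A toy sketch of what "modeled to the full population" can look like, assuming the simplest possible model (a per-demographic-cell average) and invented panel data. Real vendors presumably use something fancier, but the limitation is the same: the prediction depends only on the demographics.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical panel: the small, self-selected group whose balances are observed.
panel = [
    {"age_band": "25-34", "region": "NE", "balance": 1200},
    {"age_band": "25-34", "region": "NE", "balance": 9800},
    {"age_band": "55-64", "region": "SW", "balance": 40000},
]

FEATURES = ("age_band", "region")  # stand-ins for the 7-10 standard demographics


def fit_cell_means(panel, features):
    """Average the observed balance within each demographic cell."""
    cells = defaultdict(list)
    for row in panel:
        cells[tuple(row[f] for f in features)].append(row["balance"])
    return {cell: mean(values) for cell, values in cells.items()}


def predict(person, cell_means, features, fallback):
    """Assign a person their cell's average, or the panel-wide average if unseen."""
    return cell_means.get(tuple(person[f] for f in features), fallback)


cell_means = fit_cell_means(panel, FEATURES)
fallback = mean(row["balance"] for row in panel)

# Two people the model has never seen, with very different real finances.
broke = {"age_band": "25-34", "region": "NE"}
wealthy = {"age_band": "25-34", "region": "NE"}
print(predict(broke, cell_means, FEATURES, fallback))
print(predict(wealthy, cell_means, FEATURES, fallback))
```

Both people get the same "modeled balance" because they share a cell, and that number comes from only two panel members.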

Seems like Plaid would be f’d six ways til Sunday if it got out that they were selling consumer data to 3rd parties, no? A huge part of their business model is based on trust and doing that would completely burn it.

Sorry, maybe “third party” isn’t the correct term. Let me try to lay out my point a bit more clearly:

Plaid’s business model is — Company A needs a consumer’s data from Bank B. Plaid takes the consumer’s banking credentials, gets the data, and sells it to Company A.

At no point in this process does Plaid go and sell this data to another unrelated Company C. The lawsuit cited was about Plaid not sufficiently explaining its position between Company A and Bank B to the consumer. It was not about Plaid going and selling the data to the highest bidder.

How do you think they made money? It certainly wasn't from licensing their SDK that intentionally spoofed 3rd-party banks in a way that deliberately misled users into assuming they were logging in with the bank directly instead of handing Plaid an access token that allows them to exfiltrate arbitrary transaction histories.

Any time you hear yourself utter the words "Wouldn't x be f'd if word got out that y"... You need to stop and consider that there is an entire industry around reputation management, and PR crisis management that is leverageable by the deep pocketed in order to keep their name out of news items, and that the favorite acquisition of the absurdly deep pocketed is the media outlet/platform.

Think. The world is full of scummy people looking to make a buck, and a much smaller number who worry about doing so honestly. Until you meet one of the rare ones who falls on their sword for their ideals, never assume the guy on the other side of the table is one until proven through deed.

They make money through the fees they charge companies that pay for their service, so that they can get banking data from their consumers. Those fees are not cheap, so I do imagine they are doing most of the work to sustain the business right now.

I’m not saying “you should trust Plaid with your data” — absolutely, 100% not that. I imagine that’s how I’m being interpreted, hence all the downvotes.

What I’m saying is that at the present time, it does not seem to me that Plaid would be incentivized to do something that they explicitly say they are not doing. Plaid’s business model is, trust us to get your customers' data and deliver it to you, and only you, safely. Selling it to Bob down the street on top of that would threaten their primary business model. And today, that primary business model is doing very well! So why threaten it?

Now, someday in the future, maybe that business model has stagnated, and line still needs to go up, so someone may get greedy and that may change. In fact, this is even likely to happen! But there will be signals that it is coming.

Even re: the issue of misleading users that they are not their bank — after they got slapped down on that one, their strategy changed. There is a new set of regulations around disclosure around these things, and Plaid is pushing them pretty hard. My guess is they had some hand in drafting these regs and are hoping to use a higher regulatory burden to build a moat against competitors.

But honestly, I’m kind of surprised at the lack of nuance in understanding how Plaid works, especially here on HN.

The value prop of Plaid, Yodlee, et al is that they can do this with one(-ish) API surface for tens of thousands of financial institutions. In their efforts to ensure Bob down the street won’t be sold any data, they do treat each customer (of the API, not the end users they pull data on behalf of) as an isolated tenant.

Pretty much no corporation in the last 40 years has suffered the consequences of their actions. Boeing has killed how many people and it's taking an act of Congress to even start talking about some consequences later, maybe.

Arthur Andersen went under after its accounting negligence: https://en.m.wikipedia.org/wiki/Arthur_Andersen

A few food companies have failed due to poor quality control: https://www.thestreet.com/retail/another-popular-ice-cream-b...

In fact, many companies go bankrupt every year: https://en.m.wikipedia.org/wiki/Bankruptcy_in_the_United_Sta...

>Pretty much no corporation in the last 40 years has suffered the consequences of their actions.

There are hundreds of regulatory actions taken by governments per year. That's "consequence" by definition.

Fines of a few percent of the revenues generated aren’t enough of a deterrent.

That logic suffices as truth to you?

> None of it is accurate and almost all of it is modeled from sparse, low quality training sets. Banks are not selling PII’ed account balance data to shady aggregators.

Part of the problem though is that much of this data is persistent, on the order of a human lifetime.

How often does your employer salary history have to be obtained to be useful? Maybe once every 10 years?

I have zero faith that this isn't happening in jurisdictions without national laws prohibiting it (and laws that prevent usage of extra-national data).

> Banks are not selling PII’ed account balance data to shady aggregators.

Banks might not be directly selling the transaction history, but they report the customer transaction history to Equifax and similar credit scoring agencies. Equifax certainly does on-sell that to shady credit companies, which has happened to me twice, with the letters in both cases stating in the footer, in a very small font size and a very pale hue of grey, «provided by Equifax».

Maybe they are using garbage data, but at least for the credit checks, he was running them on-demand at $0.75 a pop. He also mentioned browser fingerprint databases that he has purchased. Half of his job seemed to be processing and importing different databases that he had purchased.

I use an app called PayTM for online payments. It shows me notifications that I have rent pending on a flat which I rent, even though I have NEVER used it to pay rent. It also shows me that I have pending electricity bills. It also picks up and shows me how much credit card payment is due, even though I have never used it to pay credit card bills.

All of this information can come only through cooperation between banks, credit reporting companies, utilities etc.

Any ideas on how I can tank the predictions made from my metrics so that I stop being marketed to so aggressively?

Second. Had to get a spam blocker because I was getting like 5-10 calls/day from “debt consolidation” companies, which is a significant distraction.

The spam blocker is pretty powerful though: you aren’t getting past it unless you are in my contacts or have a # flagged as affiliated with a reputable business.