by jeffbee 339 days ago
I would say that in general the HN crowd doesn't understand the industry at all, and they need to change the direction of their understanding, rather than the magnitude. Your basic hackernews believes that e.g. Google is out there selling all your personal information. But compared to these other industries the tech industry is almost airtight. It has long been possible for someone to pick up the phone and order, in any format they want, transaction data as narrowly targeted as they wish. Credit card line items for 35-year-old dentists living on the 400 block of Elm street in local town? By end of day.
10 comments

This is correct; what people fundamentally misunderstand is that data brokers directly sell personal information about people, but Google and Facebook only allow for targeted advertising while keeping personal information within the confines of their company.
This isn't misunderstood, just not relevant. Google sells to a funnel that plays a numbers game, not for individuals to be targeted.
The meta-conspiracy-theory would be that the dossier industry whips up conspiracy theories about online advertisers in order to maintain their own low profile.
It has been truly frustrating when people blame the "tech industry" for what is essentially reckless behavior from other industries. For a while, it was often the finance sector that did most of the crazy stuff, with crypto being an obnoxious overlap of the two.
Data brokers are the OG tech industry. They've been around since the late 60s selling consumer data. Just because it's unsexy data storage and query work doesn't make it less tech.
I mean, somewhat fair. But when people decry "big tech," they aren't talking about these companies.
It has been truly frustrating when arms dealers are punished for what is essentially reckless behavior from warlords, dictators, and drug cartels.
I'm not entirely sure I follow? Most of the "arms" that people love to complain about aren't created or sold by "big tech."

Some of this is that people are rather wrong about just how much smarter some of the big consumer tech companies are.

I'm also surprised that this is so hidden from everyone. Where are the engineers leaking secrets? Much of the online discourse is pure speculation based on what can be observed from the very end of the chain (i.e., what your computer is giving up). The speculation is not necessarily _incorrect_ but is too vague to be useful to anyone. Where does my data _actually_ go? Does anyone know? Can anyone describe the life of my data as it goes through the whole ecosystem? Does anyone know which mitigations are, and are not, effective?
Because what's the headline you're going to get out of it?

If the headline is "Mark Zuckerberg is amassing your data and you know it's for evil", it's an easy sell. If it's "there's an ecosystem of little-known companies that sell transaction, location and lifestyle data to marketers, journalists, PIs, and police departments alike", it's not exactly the kind of message that spurs people to action. And yeah, the newspaper that would be breaking the news is a customer too.

Despite being near universally hated externally, data brokering is a boring industry and is seen as very mundane and routine. It doesn't attract the type of engineers that have a strong moral stance and will go rogue and blow the whistle. It attracts the middle-aged suburbanite just trying to get through the day and make a living.
> Your basic hackernews believes that e.g. Google is out there selling all your personal information

To add to this, any mention of "telemetry" is taken to mean your PII being taken by bad actors to abuse, instead of what it is in 99% of cases, which is usage statistics. (X% of our users use feature A, it merits investment). It can be both, but there's usually no place for differentiation, just pitchforks.
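
As a rough illustration of the "usage statistics" case, here is a minimal sketch of aggregate-only telemetry: it works out what percentage of sessions touched each feature and returns nothing but those percentages. The event shape, the feature names, and the `feature_usage_percent` helper are all hypothetical, not any vendor's actual pipeline.

```python
from collections import defaultdict


def feature_usage_percent(events):
    """Return {feature: percent of sessions that used it}.

    `events` is an iterable of (session_id, feature) pairs. The session ids
    are assumed to be random per-install tokens (a hypothetical design);
    they are only used for counting and are not part of the result.
    """
    sessions_by_feature = defaultdict(set)
    all_sessions = set()
    for session_id, feature in events:
        sessions_by_feature[feature].add(session_id)
        all_sessions.add(session_id)

    total = len(all_sessions)
    if total == 0:
        return {}

    # Only aggregate percentages leave this function.
    return {
        feature: round(100.0 * len(sessions) / total, 1)
        for feature, sessions in sessions_by_feature.items()
    }


# Made-up sample events for illustration.
sample_events = [
    ("s1", "export"), ("s1", "search"),
    ("s2", "search"),
    ("s3", "search"), ("s3", "export"),
    ("s4", "search"),
]
usage = feature_usage_percent(sample_events)
```

Whether a given product's telemetry actually stops at aggregates like this is exactly what a user cannot verify from the outside.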

> It can be both, but there's usually no place for differentiation

Fool me once, shame on you. Fool me 153,927,861 times, shame on me.

The place for differentiation, the place for "oh this is probably fine", the benefit of the doubt is, of course, lost.

Because someone (you? people shaped like you?) who misused telemetry destroyed trust.

> It can be both

should instead be "it usually is both and you the user have no way to know anyway."

The industry betrayed consumers' trust to the point where no project can be trusted to be mindful of data anymore. Even Proton Mail ended up ratting to the French, and that was just IP and session info, so who can we even trust to get "good telemetry"?
> Even Proton Mail ended up ratting to the French,

Answering to court orders isn't "ratting". You either answer court orders or go to prison.

Or they architect their system better so that they never collect the IP addresses to begin with. I think Privacy Pass and other things Mullvad is doing help in this area, but I am not aware of Proton working with them to implement anything like this. But Proton should do this, because it’s relevant to customers of Proton.

https://discuss.privacyguides.net/t/privacy-pass-the-new-pro...

Apparently not Privacy Pass related, will keep looking as I seem to remember that Mullvad was doing that implementation, but I may remember incorrectly.

https://discuss.privacyguides.net/t/mullvad-has-partnered-wi...
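
On the "never collect the IP addresses to begin with" point, a minimal sketch of the idea at the application-logging layer: drop the sensitive fields before anything is persisted, so there is nothing to hand over later. The field names and the `minimized_log_entry` helper are assumptions for illustration; this is not how Proton or Mullvad implement anything, and it does nothing about what the network layer or an upstream proxy can still see.

```python
# Hypothetical field names for data we never want written to disk.
SENSITIVE_FIELDS = {"client_ip", "user_agent"}


def minimized_log_entry(request_info):
    """Copy a request dict, omitting fields that should never be persisted."""
    return {k: v for k, v in request_info.items() if k not in SENSITIVE_FIELDS}


# Made-up request using a documentation-reserved IP address.
request_info = {
    "path": "/inbox",
    "status": 200,
    "client_ip": "203.0.113.7",
    "user_agent": "ExampleBrowser/1.0",
}
entry = minimized_log_entry(request_info)
```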

I don't think it is common to refer to server logs as "telemetry".
Logs aren't telemetry and calling a response to a court order "ratting out" is exactly the kind of behavior that makes people increasingly skeptical of privacy advocates.
Is that actually possible? Can we do a live test here?

Let's say we want this dataset: Credit card line items for 35-year-old dentists living on the 400 block of Elm street in local town

How much do I have to pay you to get it?

I think it could be feasible to get an ad in front of "35-year-old dentists living on the 400 block of Elm street in local town" who have bought product X, but I've never seen a transaction-by-transaction purchase history for sale.
How much you got?

Never ask a salesperson how much you have to pay when the prices are not already clearly stated. Tell them how much you are willing to spend to see if they will do it for that amount. Salespeople will always shoot high, hoping not to leave money on the table. The price might change depending on how much you squeal and how high they shot. Your initial "willing to spend" should also be lower than you're actually willing to spend, for the same but converse reason.

Ok, so nobody here knows directly of any case where such data has been purchased, or vaguely similar, and we have no pricing information whatsoever available, but we are somehow completely knowledgeable about it being possible and how to do it? That sounds unlikely.
The supposedly in-the-know responses here are full of bravado but not much other than "trust me, bro"
https://news.ycombinator.com/item?id=44565878

Yea, you know everything, don't you.

Wow the Transunion business site, that really proves it huh.
Yeah people fail to provide examples but continue to be doomers about how easy it is.
Been busy, but since you seem to be unable to find anybody by searching on your own for the past 6 hours, here's something I found with a quick little search.

https://datarade.ai/data-categories/food-grocery-transaction...

Have we really lost the ability to use search functionality??

Of course people do. 5 seconds spent doing the most sparse-ass research will help you find plenty of stuff. If people don't respond, I imagine it's either 1) for fear of outing the specific area they work in, or 2) because they realize these kinds of comments generally aren't made in good faith, so responding is a complete waste of time.

I'll waste my own time and give a trivial example just off the top of my head. Go peruse some of the products offered on this page, put on your thinking cap or even look into them further and imagine what kind of data those services provide, where it likely comes from, and where it is sold to, and you'll be well on your way - and those are just the ones that are advertised openly.

https://www.transunion.com/business

Pretty much every one of the big players people typically associate with other areas, such as personal credit, has some feet in this space somewhere. Then there's the hundreds of lesser-known fly-by-night guys that have their own DBs, built mostly off of the same data, but correlated in different ways and sold to different audiences.

There are many, many services offering data-for-sale on practically anything to practically anyone. I heard of one recently claiming it can reliably determine someone's porn preferences. The fact you personally have never come across it, or say you haven't, is only a data point that is interesting to you, and to no one else who actually knows what they are talking about in this space. Hope this post helps you somehow.

I didn’t ask for a link to a company that can do it. I want pricing. I am saying that nobody here is willing to share anything even approaching specific pricing, which makes me very much doubt that any of them have the direct transaction experience they are claiming. I don’t doubt that underwater welding exists, but I do doubt that anyone in this thread has done it, or has any direct experience with it.
>There are many, many services offering data-for-sale on practically anything to practically anyone. I heard of one recently claiming it can reliably determine someone's porn preferences.

Okay but then why not name at least a couple such services. Also, if the tech industry isn't selling data to them, where do they obtain it? Again, I see lots of ambiguity here, and the example link from transunion is hardly revealing of anything.

Credit card companies are known to sell data. https://www.cbsnews.com/news/mastercard-credit-card-customer...

Mobile service providers are known to have sold data. https://www.fcc.gov/document/fcc-fines-largest-wireless-carr...

Auto makers are known to sell data. https://www.caranddriver.com/news/a61711288/automakers-sold-...

You act like it doesn't happen, yet time and time again we learn about companies selling whatever data they can collect.

I can't believe we are still questioning this fact

What else do you need to know?

Literally all anyone is asking for is one single concrete example of a site where you can roll up and buy personal information.
But what type of range are we talking? Tens, hundreds, thousands?
It could also mean that if you have to ask... or the first rule of data brokering...

Seems like the first thing to do would be to get an account with one of these data brokers. I'd imagine most of these places are "contact us for pricing" so they can play used car salesman games

Or, you could ask John Oliver to do it for you and then tell all of us on one of his episodes exactly how in depth it could get. They have the money to do this, and it seems like something right in his team's wheelhouse.

If you need John Oliver to do it maybe it's not such a big problem? If no one here is able to provide a single concrete example, maybe it's not real?
John Oliver likes to spend HBO's money to do things others can't do while entertaining the rest of us. I'm not spending my money on something to prove what is known as possible for you. At this point, even with receipts, you're coming across as someone that would argue that grass is not green, or water isn't wet, and fire isn't hot.

Just because someone doesn't answer your belligerent questions does not mean it's not possible. It probably means that the people that are doing this with first-hand knowledge have too much to do to bother trying to convert doubting Thomas over here.

> Credit card line items for 35-year-old dentists living on the 400 block of Elm street

I do not believe that. I would like evidence before I am convinced

If my bank is releasing that data I am horrified. I live in New Zealand and our privacy laws are clear: it would be illegal.

Backwards in 2025, ask for proof it is not happening. It’s the POS terminal that actually sells the data, btw. NZ may be “behind.”
Same comment for PoS

We have strict privacy laws

That would put us ahead?

Not an expert in NZ law, but they’d have to be comprehensive to completely stop it. Legislators are also not typically savvy in this area.
The law is very clear (IANAL but I was responsible once for compliance with this law)

If I collect your private information for a purpose, I may only use it for that purpose. I may not sell it

So if I have your transaction details I may use them to complete the transaction, no other purpose

I think the HN crowd is especially vocal about the tech industry in particular because that's the industry a lot of us have first-hand knowledge of - we know from personal observation that it is anything but airtight
> Your basic hackernews believes that e.g. Google is out there selling all your personal information.

I think most people here understand that Google sells ads against that data, but they aren't selling the data.

Okay, and who are these people you contact for this data, and how do they themselves obtain it so precisely? You say the big tech industry is pretty air-tight about sharing data, so how does mysterious X company have on hand the credit ratings of all those youngish dentists on Elm street, among other kinds of information? How do these dynamics work, since you seem to know it internally?
A mobile provider enters into marketing sharing agreements with credit card companies. It extracts housing information from local property and tax records. It enters into marketing sharing agreements with retailers, payment processors like ADP. Same with license plate reading companies, loan companies, banks, professional organizations, etc.

It fills its data lakes with the vectorization and down tilt data that it collects every day. It uses federated batched Hadoop tasks to join the above data lakes into one large data lake. Mid-PB in size.

Then it looks for mobile phones that travel to the 400 block at night and stay there, that are buying dentist stuff from Walmart, travel to a dentist office every workday, have an income over $120k, and are a member of the local dentist society. Maybe look for someone with dentist student loans, graduated with a dental degree.

None of those data points can identify an individual. Taken together they can ID just about anybody.

But maybe there is a chance that you ID their wife/husband. So maybe include/exclude people that regularly visit OBGYN offices.
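
The narrowing described above can be sketched with a toy example: each attribute on its own matches several people, but intersecting them shrinks the candidate set toward one. Everything here is synthetic, and the attribute names and the `anonymity_set` helper are illustrative assumptions, not any broker's actual schema or tooling.

```python
def anonymity_set(records, **filters):
    """Return the records that match every attribute=value filter."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in filters.items())
    ]


# Entirely made-up people; no real data.
people = [
    {"id": 1, "block": "400 Elm", "age": 35, "job": "dentist"},
    {"id": 2, "block": "400 Elm", "age": 35, "job": "teacher"},
    {"id": 3, "block": "400 Elm", "age": 52, "job": "dentist"},
    {"id": 4, "block": "500 Elm", "age": 35, "job": "dentist"},
    {"id": 5, "block": "500 Elm", "age": 41, "job": "nurse"},
    {"id": 6, "block": "400 Elm", "age": 29, "job": "plumber"},
]

# Each added attribute narrows the set of candidates.
by_block = anonymity_set(people, block="400 Elm")
by_block_age = anonymity_set(people, block="400 Elm", age=35)
by_all = anonymity_set(people, block="400 Elm", age=35, job="dentist")

for label, group in [
    ("block", by_block),
    ("block+age", by_block_age),
    ("block+age+job", by_all),
]:
    print(f"{label}: {len(group)} candidate(s)")
```

The size of the remaining set is a quick way to reason about mitigations: removing or coarsening any one attribute only helps if the set left over is still large.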

Back in the day we could link cell numbers to credit card purchases in locations, to the point of being able to identify the name of the person, what they purchased, and where it was purchased. For all people in a metro area that were using credit cards and physically visiting stores.

Any way to opt out of this type of data collection per company? I know for some things you can contact each individual broker and opt out (via some identifier like your email address) of your data being at least publicly available.