Hacker News
by totetsu 982 days ago
I am constantly surprised by how prescient my media studies professor was back in 2007 about how everything has shaken out since then. "Your data is valuable, don't give it away" is ringing in my ears as I give all my data away to OpenAI.
4 comments

Your individual data isn't valuable. Your data bundled with the data of thousands-to-millions of other people? That's valuable.
Individual data is extremely valuable. They wouldn't collect and store it individually if all they wanted was the aggregate.

Identity-specific data is sold all the time for everything from advertising to credit scores and what would otherwise be called social credit scores if they were run by the government instead of private companies.

It really isn't on average. An individual dataset is usually cheap - or are you saying there is a mispricing?
I'm not really sure what would be considered cheap vs. valuable here, so I don't have a great answer there. My point was only that individual data is collected and sold often; it must be valuable or there wouldn't be a market.

I'd be curious to see how the price compares when selling 10k individuals' data versus aggregate data on the same 10k people. Presumably if it was cheaper to buy all the individual data I would do that and aggregate it myself.

The data is only useful in aggregate, but different people/use cases require different types of aggregations. Using pre-aggregated data is difficult, because it almost certainly hasn't been aggregated in the way that's convenient for whatever analysis you're trying to do with it.
The aggregate data is often more useful in commercial use cases, but plenty of use cases need the individual data as well.

Private investigators, three-letter agencies, and any company wanting to send mailers to my new address when I move all need the individual data to target me specifically.

I totally agree the aggregate data is given more value in a commercial market heavily focused on advertising and now training LLMs. My only point was that there are markets that highly value individual data as well.

The data is also useful individually. Jewish and single? Try JDate.
In aggregate, your individual data is pretty valuable.
Incidentally:

>What if I want to keep my history on but disable model training?

We are working on a new offering called ChatGPT Business that will opt end-users out of model training by default. In the meantime, you can opt out from our use of your data to improve our services by filling out this form. Once you submit the form, new conversations will not be used to train our models.

https://docs.google.com/forms/d/e/1FAIpQLScrnC-_A7JFs4LbIuze...

Even in 2007, he was ~50 years late.
Data can be copied. Giving your data away doesn't make it less valuable to you.

(There might be other concerns, e.g. around privacy, about giving your data away. But worrying about value isn't really one of them as an individual.)

Sure it does. A secret stops being a secret once you give it away to enough people. I feel like you're trying to justify piracy but picked the wrong argument.
I'm talking about personal data. Piracy is rarely a concern here.

(And btw, piracy only makes the data less valuable, because you lose the ability to sell it to someone who already has it. Not because the game or movie etc would become less enjoyable.)

>Data can be copied. Giving your data away doesn't make it less valuable to you.

Data is a moat. If you share it you're giving your advantage away.

That doesn't necessarily follow. Film data can be copied, but production companies don't give it away.