There's a difference between a book and data, or music and data. That is their data. If you have a painting and I take a picture of it and store it on my drive, it's my data. I don't own the copyright to it, though, but it's my data and not your data, even though it's a picture of your painting.
When you posted the picture to myspace under the terms of their user agreement you granted them unlimited rights to redistribute that image to anyone in the world.
If you care about privacy don't post private stuff online.
Yup. That's your data now. And also mine (if I have a backup) and also myspace's.
The fact that makes it your data is that you physically can share it with someone else.
At least that's the value system I live by and I believe should be in place for all because it perfectly reflects the reality of what happens with ones and zeroes.
I'd say that it'd be your data but you might not be the copyright holder. But if the data is on a storage media that you own, I would consider it your data.
Lysenko as in the Soviet scientist? I don't really see what, if anything, a mistaken belief about evolution has to do with legal or moral definitions about ownership of data.
Saying "Lysenkoism is true" is factually wrong, but saying "physical possession is equivalent to ownership" is just a very fringe political opinion.
So I don't see how "the GDPR" can be wrong, unless you mean it in the sense of "the death penalty is (morally) wrong", which is just your opinion in that case.
My point is this: If your insurance provider, for example, obtains access to your medical records and stores them on their servers, does that make it "their data" to use as they please? This would imply that:
> But if the data is on a storage media that you own, I would consider it your data
Where did you find that picture? If the person printed it out and plastered it on a nearby signpost for everyone to see, I'd say it is no longer personal data.
I'm not sure why you're being downvoted when you're just describing typical Internet behavior. How many archives or search engines have come and gone that have scraped, saved, and served data from other sources (verbatim, no less) with little to no scrutiny?
You argued that gathering of data signals ownership of it. But I don’t know that reasonable people would agree that that’s about framing.
If you’re going to argue data ownership at all, it seems to me the creator of the data is the owner, unless they transfer ownership to another person or to the public domain.
On the other hand, I can understand a stand that data can never be “owned”, but I don’t think you are saying that.
They put in the effort to compile and serve the dataset. That is the useful thing in regard to LLMs.
Particularly when it comes to training AI it's not at all clear to me how traditional copyright benefits society at large. Obviously models regurgitating works wholesale would be problematic. But also obviously models are extremely useful tools and copyright is largely an impediment to creating them.
> You argued that gathering of data signals ownership of it. But I don’t know that reasonable people would agree that that’s about framing.
First off, I am a very reasonable person, so you already have one. Second, even in our sick information economy, public data can be owned when gathered in a database by a third party. The company that created the database can sell access to it and go after people who republish the database, even though it consists 100% of public and free data.
> If you’re going to argue data ownership at all, it seems to me the creator of the data is the owner, unless they transfer ownership to another person or to the public domain.
If you go by what's natural, instead of by "please, institutionally protect my obsoleted business model", the creator has the sole ownership of the data until he transfers the data to someone else. If he made a copy and gave it to someone, now they both have the ownership. If he just gave away the data now there's a new single owner of the data. Then IP ownership would work just like ownership of every other actual thing in the universe.
> On the other hand, I can understand a stand that data can never be “owned”, but I don’t think you are saying that.
Oh, it definitely can be owned. I own all zeroes and ones on the computer that I own. Please don't steal them and don't tell me what I can do with them.
I mean yeah, since it's the privatization of data, but I think the spirit is that data itself doesn't belong to anyone but rather what you can hold is yours? I don't know, it was a tongue-in-cheek comment and now I'm actually thinking about it.
> I think the spirit is that data itself doesn't belong to anyone but rather what you can hold is yours?
It definitely belongs to someone. To the person holding it (provided that it wasn't stolen). Just as any other actual thing. Except for borrowed items.
I don't know if I'm misunderstanding you, but tons of actual things don't belong to the person "holding" or using them. Leased cars, rented houses, work equipment, stolen items. Saying that "anything belongs to the person holding it, except for borrowed items" is a huge simplification, which ignores a bunch of history and legal precedent establishing exactly what it is people mean when they say somebody owns something.
Your definition of data ownership certainly is a definition, but it's far from obvious or mainstream. If you texted an intimate photo to an ex, do you consider them as the owner of the photo, meaning that they're allowed to do whatever they want with that photo (as ownership typically implies)?
At least this isn't saddled with a profit motive and the destruction of the consumer computing market.