	by ROFISH 923 days ago
	So if you delete your image the entire trained data set is invalid because they no longer have license to the copyright?

4 comments

notatallshaw 923 days ago

If having copyright were a prerequisite of training data this would be true.

But in the US this hasn't been tested in the courts yet, and there's reason to think from precedent this legal argument might not hold (https://www.youtube.com/watch?v=G08hY8dSrUY - sorry don't have a written version of this).

And the lawsuits so far aren't faring well for those who think training should require having copyright (https://www.hollywoodreporter.com/business/business-news/sar...)


JAlexoid 923 days ago

I would imagine if we use a very strict interpretation of copyright, then things like satire or fan-fiction and fan-art would be in jeopardy.

As well as learning, as a whole.

Unless there is literally a substantial copy of some particular piece of copyrighted material, it seems to be a massive hurdle to prove that analyzing something is copyright infringement.


slaymaker1907 923 days ago

Most people in the fanfiction community recognize that it's probably not strictly allowed under copyright. However, the community response has generally been to do it anyway and try to respect the wishes of the author. Hence why you won't find Interview with the Vampire fanfiction on the major sites.

If anything, I think it severely hinders the pro-AI argument if fanfiction made by human authors is also bound by copyright.

ETA: I just tested it out and you can totally create Interview with the Vampire fanfiction with Bing Compose. That output is presumably subject to copyright at least as strongly as work by human authors and is thus a copyright violation.


shagie 923 days ago

I would also suggest reading https://en.wikipedia.org/wiki/Copyright_protection_for_ficti...

> Copyright protection is available to the creators of a range of works including literary, musical, dramatic and artistic works. Recognition of fictional characters as works eligible for copyright protection has come about with the understanding that characters can be separated from the original works they were embodied in and acquire a new life by featuring in subsequent works.

Creating a work using Harry Potter or Darth Vader or Tarzan ("As of 2023, the first ten books, through Tarzan and the Ant Men, are in the public domain worldwide. The later works are still under copyright in the United States.") is a copyright infringement.

You may also find https://www.hollywoodreporter.com/business/business-news/dc-... interesting as well as the entire legal saga of Eleanor.

---

Creating Interview with the Vampire fan fiction with Bing - Bing didn't have any agency. The question of copyright infringement (I believe) should only be applied to entities with the agency to ask (or not ask) for copyright-infringing works.


pr337h4m 923 days ago

> Creating a work using Harry Potter or Darth Vader or Tarzan is a copyright infringement

Transformative works are a thing:

https://www.transformativeworks.org/faq/#:~:text=investments...

https://www.transformativeworks.org/faq/#:~:text=Open%20Door...


mr_toad 923 days ago

> I just tested it out and you can totally create Interview with the Vampire fanfiction with Bing Compose.

That’s the output of the model; it doesn’t have much bearing on the copyright status of the model.


ClumsyPilot 923 days ago

> if we use a very strict interpretation of copyright, then things like satire ... would be in jeopardy.

Satire, criticism, reviews and journalism are explicitly permitted under fair use.

If I wish to publicly express my disdain or praise for your art, it is necessary that I can show samples / pictures / photos when I express whatever my deal is.


kjkjadksj 923 days ago

The difference is that when writing satire it's not strictly necessary to possess the work to do so. You can merely hear of something and make a joke or a fake story. Training data, on the other hand, uses the actual material, not some derivative you gleaned from a thousand overheard conversations.


dragonwriter 923 days ago

> So if you delete your image the entire trained data set is invalid because they no longer have license to the copyright?

The portion of the training set might. The actual trained result -- the outcome of a use under the license -- would, at least arguably, not.

Of course, that's also before the whole "training is fair use and doesn't require a license" issue is considered, which if it is correct renders the entire issue moot -- in that case, using anything you have access to for training, irrespective of license, is fine.


panarky 923 days ago

Let's say you post an image, and I learn something by viewing it, then you delete the image. Is my memory of your now deleted image wiped along with everything I learned from viewing it?


dylan604 923 days ago

I have seen plenty of images on the internet where I would gladly accept this as a thing. Unfortunately, what's been seen can't be unseen.


opello 923 days ago

Unfortunately, computer memory, unlike your memory, is easily wiped. Having the infrastructure in place to make sure it happens, on the other hand, seems more like human memory.


KaiserPro 923 days ago

Now that is a multi-million dollar question.

How derived data is handled after copyright is revoked is a question that's hard to answer.

I suspect that the data will be deleted from the dataset, and any new models will not contain derivatives from that image.

How legal that is, is expensive to find out. I suspect you'd need to prove that your image had been used, and that its use contradicts the license that was granted. It would take a lot of lawyer and court time to find out. (I'm not a lawyer, so there might already be case history here. I'm just a sysadmin who's looking after datasets.)

postscript: something something GDPR. There are rules about processed data, but I can't remember the specifics. There are caveats about "reasonable".


grogenaut 923 days ago

s/m/tr/


klyrs 923 days ago

> Now that is a trulti-million dollar question.

Huh? I think you want s/(:?m[^m]*)m/tr/
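
For anyone who wants to check the two substitutions, here is a rough sketch in Python rather than sed. `re.sub` with `count=1` stands in for a sed `s///` without the `g` flag. The third pattern is not from this thread: it is a hypothetical variant that assumes the goal was "multi-trillion". Note that `(:?` as written is a capturing group starting with an optional colon, not the non-capturing `(?:`, though it matches the same text here.

```python
import re

text = "Now that is a multi-million dollar question."

# s/m/tr/ : only the first "m" on the line is replaced.
first_m = re.sub(r"m", "tr", text, count=1)

# s/(:?m[^m]*)m/tr/ : matches from the first "m" through the second "m"
# and replaces that whole match with "tr".
whole_match = re.sub(r"(:?m[^m]*)m", "tr", text, count=1)

# Hypothetical variant (an assumption, not from the thread): keep the
# captured prefix and replace only the second "m".
keep_prefix = re.sub(r"(m[^m]*)m", r"\1tr", text, count=1)

print(first_m)
print(whole_match)
print(keep_prefix)
```

The first call gives "trulti-million", the second gives "trillion", and the variant with the `\1` backreference gives "multi-trillion".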
