The important distinction is that the model does not contain a copy of a copyrighted material. It contains data trained on copyrighted material (like our brains).
You're making a distinction that isn't reflected in copyright law.
If I take a JPEG of Mickey Mouse and then turn it into a PNG, it's not a "copy", as the bits are different. But it still contains copyrighted material.
You can try to argue that the bits of the PNG itself aren't an image of Mickey Mouse, but rather that the algorithm that reads the PNG produces an image of Mickey Mouse. But that isn't really a relevant distinction insofar as copyright is concerned.
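To make the "different bits, same content" point concrete, here is a minimal sketch. It assumes a toy byte string as a stand-in for pixel data and uses raw bytes versus zlib's DEFLATE (the compression PNG uses) as stand-ins for two file formats. The names `format_a`, `format_b`, `decode_a` and `decode_b` are made up for illustration. Note this only models lossless re-encoding, and a real JPEG is lossy.

```python
import zlib

# Toy stand-in for decoded pixel data (assumption: a real image would be far larger).
pixels = bytes([0, 0, 0, 255, 255, 255] * 64)

# Two hypothetical "file formats" holding the same content.
format_a = pixels                     # raw, uncompressed bytes
format_b = zlib.compress(pixels, 9)   # DEFLATE-compressed bytes


def decode_a(blob: bytes) -> bytes:
    """Format A needs no decoding: the stored bytes are the pixels."""
    return blob


def decode_b(blob: bytes) -> bytes:
    """Format B only becomes pixels once the decoder runs."""
    return zlib.decompress(blob)


# The stored bits differ between the two representations...
stored_bytes_differ = format_a != format_b
# ...but both decode to exactly the same content.
decoded_content_matches = decode_a(format_a) == decode_b(format_b)

print(stored_bytes_differ, decoded_content_matches)
```

Neither blob is "the image" until something decodes it, yet both carry the same content.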
In addition, this statement is false:
> The important distinction is that the model does not contain a copy of a copyrighted material.
It has been shown repeatedly that the model produces copies of training data. The copies are of course not stored as JPEGs/PNGs in the model, but they are retrievable from the model, given the correct password (prompt).
Could you provide evidence of your last statement? I haven’t seen these models produce actual copies of any art (can’t imagine that’s an option in general).
These models do not contain copies. One way to describe the data is they contain a statistical breakdown of the artwork, which is substantially different from a JPEG -> PNG conversion you mention.
> Could you provide evidence of your last statement? I haven’t seen these models produce actual copies of any art (can’t imagine that’s an option in general).
> These models do not contain copies. One way to describe the data is they contain a statistical breakdown of the artwork, which is substantially different from a JPEG -> PNG conversion you mention.
I don't understand the distinction you're making. What legally separates a "statistical breakdown" representation from a zip file representation, a JPEG representation, or a PNG representation?
As I anticipated, you are referencing research that does not show exact copies being generated by Stable Diffusion. Do "semantically equivalent" images infringe on copyright? I would argue that they do not. We will see how this plays out in court.
Food for thought: if I write instructions for generating an SVG of a black square, does my program contain copyrighted material (Malevich's Black Square)? You and I could argue about that, but you will probably quote more research that disproves your own point. So let's skip that.
> As I anticipated, you are referencing research that does not show exact copies being generated by Stable Diffusion. Do "semantically equivalent" images infringe on copyright? I would argue that they do not. We will see how this plays out in court.
If you're convinced that a photo of Mickey Mouse with slightly larger ears, or slightly reddish pants isn't copyright infringement, then sure, neither is any of this stuff. It would also mean that republishing copyrighted images with a lossy compression algorithm (i.e., as JPEGs) would not violate copyright.
I would suggest looking at the actual laws around copyright instead of relying on what you feel copyright should be.
> If you're convinced that a photo of Mickey Mouse with slightly larger ears, or slightly reddish pants isn't copyright infringement, then sure, neither is any of this stuff.
Isn't this already well-established? For example, this image, used in The Simpsons:
Amazing. It’s so obvious I wonder why billion dollar corporations didn’t figure out the legal implications of these models yet. Do you have an email address I could pass on to OpenAI?