If the model didn't learn anything important from Picasso, it wouldn't be in the training data.
This whole argument of "ah but it doesn't really need it" doesn't hold up. If the model didn't need it, it wouldn't have used it in the first place.
Same thing with ArtStation. It was of course propitious for AI scientists to find such a lovely database of high-quality imagery, and all so helpfully tagged into categories.
> If the model didn't learn anything important from Picasso, it wouldn't be in the training data.
> This whole argument of "ah but it doesn't really need it" doesn't hold up. If the model didn't need it, it wouldn't have used it in the first place.
I haven't seen anyone making this argument. There's a pretty clear difference between learning something from an image and memorizing it.
There also isn't anything illegal about memorizing an image and painting a reproduction. What you aren't allowed to do is sell or distribute that reproduction without a license.
I think it makes more sense to restrict what people are allowed to do with ML tools than to restrict what ML tools can do.
Of course it learned, that's the point of training.
You claimed the model can reproduce an image from that training data. That's false, and it's what the judge dismissed.
“none of the Stable Diffusion output images provided in response to a particular Text Prompt is likely to be a close match for any specific image in the training data.”
“I am not convinced that copyright claims based [on] a derivative theory can survive absent ‘substantial similarity’ type allegations,” the ruling stated.
Whether using copyrighted data to train a model is fair use or not is a different discussion.
This whole argument of "ah but it doesn't really need it" doesn't hold up. If the model didn't need it, it wouldn't have used it in the first place.
Same thing with ArtStation. It was of course propitious for AI scientists to find such a lovely database of high-quality imagery, and all so helpfully tagged into categories.
All they had to do was take it.