| HN Mirror



	by lt 969 days ago
	A diffusion model can't make a copy. That's the whole point. The original Picasso isn't in the model weights. It has learned to make pixels a particular color to mimic that style, but that's it.

1 comments

kranke155 969 days ago

If the model didn't learn anything important from Picasso, it wouldn't be in the training data.

This whole argument of "ah but it doesn't really need it" doesn't hold up. If the model didn't need it, it wouldn't have used it in the first place.

Same thing with ArtStation. It was of course propitious for AI scientists to find such a lovely database of high-quality imagery, and all so helpfully tagged into categories.

All they had to do was take it.


shkkmo 969 days ago

> If the model didn't learn anything important from Picasso, it wouldn't be in the training data.

> This whole argument of "ah but it doesn't really need it" doesn't hold up. If the model didn't need it, it wouldn't have used it in the first place.

I haven't seen anyone making this argument. There's a pretty clear difference between learning something from an image and memorizing it.

There also isn't anything illegal about memorizing an image and painting a reproduction. What you aren't allowed to do is sell or distribute that reproduction without a license.

I think it makes more sense to restrict what people are allowed to do with ML tools than to restrict what ML tools can do.


lt 969 days ago

Of course it learned, that's the point of training.

You claimed the model can reproduce an image from that training data. That's false, and it's the claim the judge dismissed.

  “none of the Stable Diffusion output images provided in response to a 
  particular Text Prompt is likely to be a close match for any specific image in  
  the training data.”

  “I am not convinced that copyright claims based a derivative theory can 
  survive absent ‘substantial similarity’ type allegations,” the ruling stated.

Whether using copyrighted data to train a model is fair use or not is a different discussion.
