by dahart 1265 days ago
> What you are talking about is Overfitting.

Not really, though that is another legitimate issue.

I was talking about 1) the fundamental training and inference process, which remembers pixels, not concepts or techniques. Today’s AI learns to create imagery in a fundamentally different way than people do. And 2) text-prompted image generation AI like Stable Diffusion can easily be asked to reproduce training data with a prompt that is narrow and specific enough. This is not overfitting; it’s a function of the fact that some inputs are quite unique, and you can use the prompt to focus on that uniqueness.


The training process looks at pixels. It gets an impression of the relationships between words and curves in images. But to say it “remembers pixels” is pretty loaded language that implies copying pixels into the model file.

I’d like to see examples of using SD to copy some specific piece of art that hasn’t been plastered millions of times across the internet. Sure, you can get a decent Mona Lisa knockoff. Maybe even a strong impression of the Bloodborne game cover art marketing material. But reproducing a specific painting from Rutkowski would be quite a surprise to me.

Hehe, adding artist names to the prompt is one of the most common ways people are getting closer to copying. https://lwneal.com/rutkowski.html

Here are the examples you requested: https://techcrunch.com/2022/12/13/image-generating-ai-can-co...

Yes, the training process looks at pixels, because that’s all it has. That’s the point. Humans don’t look at pixels, they learn ideas. It’s not in the least bit surprising that AI models shown a bunch of examples sometimes replicate their example inputs: examples are all they have, and they are built specifically to reproduce images similar to what they see. I’m not sure why you consider that idea “loaded”.

Again, naming Rutkowski invokes an impression of his style, but copies none of his paintings.

Read the paper. What I found is that a random sampling of the database naturally found a small subset of images that are highly duplicated in the database. Researchers were able to derive methods to produce results that give strong impressions of images such as a map of the United States, Van Gogh's Starry Night, and the cover of Bloodborne :P, with some models and not at all with others. The researchers caution against extrapolating from their results.

> We speculate that replication behavior in Stable Diffusion arises from a complex interaction of factors, which include that it is text (rather than class) conditioned, it has a highly skewed distribution of image repetitions in the training set, and the number of gradient updates during training is large enough to overfit on a subset of the data.