| Yes, but I mean it's also wrong... That's not how the diffusion process works. You can pick any number of interesting ways to describe it, but if they're technically wrong, it doesn't really matter how poetic they are, right? Diffusion models do use random noise. As I understand it, every 'step' is composed of three parts: a) the previous output, b) the latent generated from the prompt, and c) random noise. As you move further up, the scheduler changes the weights of a, b, and c that get mixed in. ...but from the article: > The subtle error comes in a misunderstanding about the "randomly generated noise." It's not an error. You're just focusing on what you want to focus on. Let's be 100% blunt: The author of an AI art image is pressing the random generator button. Every time. The output is random. It's not a matter of debate; the initial seed to the diffusion model is random noise. The prompt guides the diffusion process, which basically denoises the random noise added to the image, certainly... but saying there's no random component to it is completely and utterly wrong. |
Perhaps that first sentence could be more precise, but by the end of the paragraph the author’s meaning is clear: the court is mistaken about the “randomly generated noise” if it believes there is randomly generated noise in both the pixels and the latent. That is not the case. There is no randomness in the latent: that exact handcrafted prompt picks out a precise spot in the model’s giant table of embeddings, and it will always pick out that same spot in that model. The random noise is only on the pixel side of things. The author believes the court has this misunderstanding because the court uses the analogy of “a patron makes a suggestion to an artist”, a scenario that DOES involve random noise in producing the latent (the brain is an inherently noisy place; an artist’s brain likely even more so).
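To make the distinction concrete, here is a toy sketch of the two pieces being argued about. It is not how any real diffusion model is implemented. The vocabulary, the `EMBEDDING_TABLE`, and the functions `embed_prompt`, `initial_noise`, and `generate` are all hypothetical names made up for illustration, and the "denoising" loop is just a simple pull toward the prompt embedding. The only point is that the prompt side is a deterministic lookup, while the starting noise depends on a seed.

```python
import random

# Hypothetical toy "model": a fixed table of embeddings, built once.
# The fixed seed here stands in for frozen, already-trained weights.
VOCAB = ["a", "cat", "in", "space", "opera", "theater"]
_table_rng = random.Random(0)
EMBEDDING_TABLE = {w: [_table_rng.uniform(-1, 1) for _ in range(4)] for w in VOCAB}


def embed_prompt(prompt):
    """Deterministic: average the table rows of the known words in the prompt."""
    rows = [EMBEDDING_TABLE[w] for w in prompt.lower().split() if w in EMBEDDING_TABLE]
    if not rows:
        return [0.0] * 4
    return [sum(col) / len(rows) for col in zip(*rows)]


def initial_noise(seed, size=8):
    """Random: the starting point depends entirely on the seed."""
    rng = random.Random(seed)
    return [rng.gauss(0, 1) for _ in range(size)]


def generate(prompt, seed, steps=10):
    """Toy 'denoising': repeatedly nudge the noise toward the prompt embedding."""
    cond = embed_prompt(prompt)
    x = initial_noise(seed)
    for _ in range(steps):
        x = [xi + 0.3 * (cond[i % len(cond)] - xi) for i, xi in enumerate(x)]
    return x


prompt = "a cat in space opera theater"
same_embedding = embed_prompt(prompt) == embed_prompt(prompt)
same_seed_same_output = generate(prompt, seed=1) == generate(prompt, seed=1)
different_seed_different_output = generate(prompt, seed=1) != generate(prompt, seed=2)
```

In this sketch the same prompt always lands on the same embedding, and the same prompt with the same seed always gives the same output. Only changing the seed changes the result, which is the sense in which the randomness lives on one side and not the other.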