|
|
|
|
|
by minism
1350 days ago
|
|
Great overview, I think the part for me which is still very unintuitive is the denoising process. If the diffusion process is removing noise by predicting a final image and comparing it to the current one, why can't we just jump to the final predicted image? Or is the point that because its an iterative process, each noise step results in a different "final image" prediction? |
|
To generate an image we start with a noise sample and take a step towards the _mean_ of the distribution of real images which would produce that noise sample when running the forward diffusion process. This step moves us towards the _mean_ of some distribution of real images and not towards a particular real image. But, as we take a bunch of small steps and gradually move back through the diffusion process, the effective distribution of real images over which this inverse diffusion prediction averages has lower and lower entropy, until it's effectively a specific real image, at which point we're done.