by yorwba
333 days ago
> I used it myself in a talk almost 3 years ago! And it isn't a lie exactly, the linked paper is totally sound. The paper was published in December last year and addresses your concerns head-on. For example, from the introduction: "if the network can learn this ideal score function exactly, then they will implement a perfect reversal of the forward process. This, in turn, will only be able to turn Gaussian noise into memorized training examples. Thus, any originality in the outputs of diffusion models must lie in their failure to achieve the very objective they are trained on: learning the ideal score function. But how can they fail in intelligent ways that lead to many sensible new examples far from the training set?" Their answers to these questions are very good and also cover things like correcting the output of previous steps. But the proof is in the pudding: the outputs of their alternative procedure match the models they're explaining very well. I encourage you to read it; maybe you'll even find a new way to decompose images into surface material properties and lighting as a result.
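
To make the quoted point about the ideal score function concrete, here is a minimal sketch in plain Python. It is my own illustration, not code from the paper. It assumes a variance-exploding noise model (x = x0 + sigma * noise), a tiny made-up 2D training set, and a hand-picked geometric noise schedule. The names `ideal_denoiser`, `ideal_score` and `sample` are hypothetical. With the exact score of the noised training set, a deterministic reverse process should end on the training points themselves.

```python
import math
import random

def ideal_denoiser(x, train, sigma):
    """E[x0 | x] when x0 is uniform over `train` and x = x0 + sigma * noise."""
    logits = [-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * sigma ** 2)
              for p in train]
    top = max(logits)  # subtract the max so exp() cannot overflow
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [sum(w * p[k] for w, p in zip(weights, train)) / total
            for k in range(len(x))]

def ideal_score(x, train, sigma):
    """Score of the noised training distribution: (E[x0 | x] - x) / sigma^2."""
    d = ideal_denoiser(x, train, sigma)
    return [(dk - xk) / sigma ** 2 for dk, xk in zip(d, x)]

def sample(train, sigmas, rng):
    """Euler steps of a deterministic reverse process along `sigmas`."""
    x = [rng.gauss(0.0, sigmas[0]) for _ in train[0]]  # start from pure noise
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        score = ideal_score(x, train, sigma)
        # dx/dsigma = -sigma * score in this (assumed) parametrisation
        x = [xk - (sigma_next - sigma) * sigma * sk for xk, sk in zip(x, score)]
    return x

# Hypothetical toy "dataset" of four 2D points.
train = [[0.0, 0.0], [1.0, 0.2], [-0.3, 1.1], [1.5, 1.4]]

# Assumed schedule: 200 geometric steps from sigma=10 to sigma=0.01, then 0.
n = 200
sigmas = [10.0 * (0.01 / 10.0) ** (i / (n - 1)) for i in range(n)] + [0.0]

rng = random.Random(0)
samples = [sample(train, sigmas, rng) for _ in range(20)]

# Distance from each generated sample to its closest training point.
nearest = [min(math.dist(s, p) for p in train) for s in samples]
print(max(nearest))
```

If the sketch behaves as intended, every entry of `nearest` is essentially zero, i.e. the "perfect" model only memorizes. Anything new a trained network produces has to come from how it deviates from `ideal_score`, which is the question the quoted introduction raises.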
|
And I was impressed by the close fit to real CNNs/ResNets and even to UNets. But what that shows is that the real models are heavily overfit. The datasets they are using for evaluation here are _tiny_.
Edit: oh, the talk is here btw, if anyone is curious: https://youtu.be/c-eIa8QuB24