by siraben
1253 days ago
What datasets were you using, how large was the model, and what was the noise schedule? I’ve been contemplating implementing my own from scratch as well. I’m surprised that training with conditional labels did not help as much.
|
I'm just saying that I expected this method to be less finicky than GANs because it uses an MSE loss, but unfortunately it seems to have its own difficulties. No silver bullet, I guess. The integration sampling can be quite sensitive to imperfections and diverge easily, at least in the early stages of training.
I decided to write this because it feels like the early days of GANs: there are lots of "explain diffusion from scratch" articles out there, but not yet many discussing common pitfalls and how to deal with them.