
by in-silico 116 days ago

I think some of the visualizations would be much better if you used a pixel-space model instead of a latent diffusion model.

Right now we are only seeing the denoising process after it's been morphed by the latent decoder, which looks a lot less intuitive than actual pixel diffusion.

If you can't find a suitable pixel-space model, you can simply run the forward (noising) process on a real image and play it backwards.
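A minimal sketch of that fallback, assuming a DDPM-style linear noise schedule and an image scaled to [-1, 1]. The function names, step count, and beta range are hypothetical choices for illustration, not anything from the original visualization. Noise is added step by step as a Markov chain so that consecutive frames stay correlated, and the frame list is then reversed.

```python
import numpy as np

def forward_frames(x0, num_steps=200, beta_start=1e-4, beta_end=0.05, seed=0):
    """Noise x0 step by step and return every intermediate frame.

    Assumed schedule: betas spaced linearly from beta_start to beta_end.
    Each step applies x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(beta_start, beta_end, num_steps)
    x = np.asarray(x0, dtype=np.float64)
    frames = [x.copy()]
    for beta in betas:
        eps = rng.standard_normal(x.shape)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps
        frames.append(x.copy())
    return frames

def fake_denoising_frames(x0, **kwargs):
    """Reverse the forward trajectory so it starts noisy and ends at x0."""
    return forward_frames(x0, **kwargs)[::-1]

# Usage with a synthetic 32x32 RGB gradient standing in for a real image.
ramp = np.linspace(-1.0, 1.0, 32)
image = np.stack([np.tile(ramp, (32, 1))] * 3, axis=-1)
frames = fake_denoising_frames(image, num_steps=100)
```

This only looks like denoising; it is a replayed noising trajectory, not a sample from a learned reverse process, so it shows what pixel-space diffusion frames look like without showing what any model actually predicts.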

1 comment

whilefalse 116 days ago

Thanks, that’s a great suggestion.
