by l33tman 862 days ago
Just a terminology comment here. "Latent space" means different things in different models. For a GAN, for example, it means the "top concept" space: moving around in it changes the entire concept of the image, and doing that in a controlled way is notoriously difficult. For SD/SDXL it refers to the representation just above pixel space, which the VAE decoder expands from 64x64 to 512x512 pixels in the case of SD1.5.

This allows the rest of the network to be smaller while still generating a usable output resolution, so it's a performance "hack".
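
To put a rough number on that saving, here is a small back-of-the-envelope sketch. It assumes the SD1.5 VAE's 8x spatial downscale (which is what 512 -> 64 implies) and its 4 latent channels versus 3 RGB channels; the channel counts are not mentioned above, and the function names are made up for illustration.

```python
def latent_shape(height, width, downscale=8, channels=4):
    """Return the (channels, height, width) latent shape for a pixel resolution.

    Assumes an SD1.5-style VAE: 8x spatial downscale, 4 latent channels.
    """
    if height % downscale or width % downscale:
        raise ValueError("resolution must be a multiple of the downscale factor")
    return (channels, height // downscale, width // downscale)


def compression_ratio(height, width, downscale=8, latent_channels=4, pixel_channels=3):
    """Ratio of pixel-space values to latent-space values for one image."""
    c, h, w = latent_shape(height, width, downscale, latent_channels)
    pixel_values = height * width * pixel_channels
    latent_values = c * h * w
    return pixel_values / latent_values


if __name__ == "__main__":
    print(latent_shape(512, 512))       # (4, 64, 64)
    print(compression_ratio(512, 512))  # 48.0
```

So under those assumptions the denoising network works on about 48x fewer values per image than it would in pixel space, which is where the "smaller network" part comes from.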

It's a really good idea to explore it and hack into it like in the article, to "remaster" the image so to speak!