by dheera
817 days ago
I've done a bunch of experiments on my own on the Stable Diffusion VAE. Even when going down to 4-6 bits per latent-space pixel, the results are surprisingly good. It's also interesting what happens if you ablate individual channels: ablating channel 0 results in faithful color but shitty edges, ablating channel 2 results in shitty color but good edges, etc. The one thing it fails catastrophically on, though, is small text in images. The Stable Diffusion VAE is not designed to represent text faithfully. (It's possible to train a VAE that does slightly better at this, though.)
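
For anyone who wants to try something similar, here is a rough sketch of the two operations described above, applied to a latent array. It is not the exact code from those experiments, and several things are assumptions: the latents are taken to be a NumPy array of shape (4, H, W), "bits" is read as bits per channel value with uniform quantization over each channel's min/max range, and "ablating" is read as zeroing the channel. The names `quantize_latents` and `ablate_channel` are made up for illustration, and random data stands in for real encoder output. To see the visual effect you would pass the results through the VAE decoder.

```python
import numpy as np


def quantize_latents(latents: np.ndarray, bits: int) -> np.ndarray:
    """Uniformly quantize each channel to 2**bits levels, then dequantize.

    Assumption: per-channel min/max scaling; other schemes are possible.
    """
    levels = 2 ** bits - 1
    lo = latents.min(axis=(1, 2), keepdims=True)
    hi = latents.max(axis=(1, 2), keepdims=True)
    # Guard against a constant channel, where hi == lo.
    scale = np.where(hi > lo, hi - lo, 1.0)
    codes = np.round((latents - lo) / scale * levels)  # integers in [0, levels]
    return codes / levels * scale + lo


def ablate_channel(latents: np.ndarray, channel: int) -> np.ndarray:
    """Return a copy with one channel zeroed (assumed meaning of 'ablate')."""
    out = latents.copy()
    out[channel] = 0.0
    return out


# Stand-in for real VAE latents (4 channels, 64x64 spatial grid).
rng = np.random.default_rng(0)
latents = rng.standard_normal((4, 64, 64)).astype(np.float32)

quantized = quantize_latents(latents, bits=4)
ablated = ablate_channel(latents, channel=0)
```

Decoding `quantized` and `ablated` and comparing against the decode of the original `latents` is the comparison being described; replacing the channel with its mean instead of zero would be another reasonable way to ablate.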