Hacker News new | ask | show | jobs
by aerodude 2024 days ago
Having a quick read, they're pointing out that the Bernoulli distribution is only supported for values of 0 and 1 (i.e. it's binary), whereas pixel values for a grayscale image are a decimal value in the interval [0, 1]. When you train a VAE, it's pretty standard to use a BCE loss, but this is wrong because the data isn't binary (i.e. it's not a Bernoulli distribution). They define a continuous analogue of a Bernoulli distribution that is supported in [0, 1], and use this as the loss function for training a VAE, which gives them reconstructions that are closer to the input.
1 comments

Yeah, agree having skimmed the paper now. It seems kinda meh to me: it's obvious that improving the loss function is a good thing to do, and for any real application (ie, not greyscale mnist) you'd never use this binary value target.