Hacker News new | ask | show | jobs
by tonic_section 2098 days ago
There are a couple of solutions which work empirically - as you mentioned, one solution is a dithering-like differentiable relaxation where uniform noise is added, which simulates quantization, or just to ignore the quantization operation when taking gradients, essentially treating it as an identity operation in the backward pass.
1 comments

But how do you optimize the lossless encoding of the quantized latent space? ie. how do you tell the encoder to produce something that can be well encoded, given that the encoding is a bunch of discrete steps.
Usually the lossless encoding is offloaded to a standard entropy coder, e.g. arithmetic, ANS, etc. because these approach the theoretical minimum rate given by the source entropy pretty closely, so there wouldn't be a point building a fancy differentiable replacement.
That makes sense, I don't think I stated my question very clearly: how do you control/optimize the entropy of the latent space?

ie. what stops the network from laundering all of the information for reconstructing the image through a super high entropy latent space that is hard to code but allows it to reconstruct perfectly

e: I guess I should just get up to date by reading some papers

The objective function used in these lossy neural compression schemes usually takes the form of a rate-distortion Lagrangian - the rate term captures the expected length of the message needed to transmit the compressed information and the distortion term measures the reconstruction error. So it wouldn't be able to cheat like in your example, because this would incur a high value of the loss through the rate term.