by cs702 2974 days ago
Very interesting.

At a high level (ignoring many details) the main idea is to replace generator networks in GANs with Restricted Boltzmann Machines, or RBMs, which are easier to train (more stable). The authors call this kind of architecture "Boltzmann Encoded Adversarial Machines," or BEAM for short.

The experiments provide persuasive evidence that BEAMs outperform GANs. Figure 3, in particular, I find very persuasive -- it compares the ability of different architectures to learn to generate low-dimensional mixtures of Gaussians, with BEAMs very clearly outperforming GANs. The results in higher-dimensional applications such as image generation also suggest that BEAMs outperform GANs, but the improvement is somewhat more subjective due to the nature of high-dimensional data. Obviously, these results need to be replicated by others.

It looks promising to me. That said, it's been years since I've touched an RBM -- I only have a vague recollection of how they work and how they're trained, layer by layer, as proposed by Hinton in 2006 or so. Time to re-read old papers!
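
As a refresher on how a single RBM layer is commonly trained, here is a minimal sketch of one contrastive divergence (CD-1) update for a binary RBM. This is the textbook procedure, not necessarily what the BEAM paper uses; the function name `cd1_update`, the learning rate, and the array shapes are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, rng, lr=0.1):
    """One CD-1 step for a binary RBM (illustrative sketch).

    v0: (batch, n_visible) binary data; W: (n_visible, n_hidden);
    b: visible biases; c: hidden biases.
    """
    # Positive phase: hidden probabilities and samples given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step down to the visibles and back up.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    # Approximate gradient: data statistics minus reconstruction statistics.
    batch = v0.shape[0]
    W_new = W + lr * (v0.T @ ph0 - v1.T @ ph1) / batch
    b_new = b + lr * (v0 - v1).mean(axis=0)
    c_new = c + lr * (ph0 - ph1).mean(axis=0)
    return W_new, b_new, c_new
```

Layer-by-layer training in this style would stack such RBMs, feeding the hidden activations of one trained layer to the next as its data.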

2 comments

To clarify: in the case of a BEAM, both the generator and all but the top layer of the discriminator are replaced with an RBM. The adversary in this case operates on features encoded by the RBM, not raw data samples. Second, the RBM is trained with a combined loss involving log-likelihood and the adversarial term.
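
Very roughly, the structure described above could be sketched as follows. This is only an illustration under stated assumptions: the linear critic, the mixing weight `gamma`, and all function names are hypothetical, and the exact form of the loss in the paper may differ. The negative log-likelihood is passed in as a number because estimating it for an RBM is a separate problem.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode(v, W, c):
    """Hidden-unit probabilities of a binary RBM: the features the critic sees."""
    return sigmoid(v @ W + c)

def critic_score(h, w_critic, b_critic):
    """Hypothetical linear critic on encoded features; outputs P(sample is real)."""
    return sigmoid(h @ w_critic + b_critic)

def combined_objective(neg_log_likelihood, critic_on_fantasy, gamma=0.5):
    """Hypothetical mix of the two terms (gamma is an assumed weighting).

    The adversarial term is lower when the critic rates the RBM's own
    (fantasy) samples as real.
    """
    adversarial = -np.mean(critic_on_fantasy)
    return gamma * neg_log_likelihood + (1.0 - gamma) * adversarial
```

The point of the sketch is only that the critic never sees raw pixels: it scores `encode(v, W, c)`, and its output feeds back into the RBM's objective alongside the likelihood term.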
Yes. For simplicity's and brevity's sake, I ignored many important details in my summary.
No worries! :)
Thanks. Have you made any code available online?
Yes. The following recent review article provides code samples (https://arxiv.org/abs/1803.08823) that use an open-source version of our software called 'paysage' (https://github.com/drckf/paysage). It has not been updated recently, but we expect to put out a new update quite soon. The update will clean up the code, docs, and features, but might not yet contain the BEAM training code. The latter is pending some decisions about IP, etc.
Thank you. I'll take a look!
I am not entirely convinced. In particular, the results shown in Fig 7 remind me of the BEGAN paper which was similarly hyped. But I'll defer further judgment until I read through it more and maybe run some experiments.
There's a good reason that the pictures look similar. Both architectures produce somewhat blurry images.

The problem, in BEGAN's case, is that when your idea of similarity is based on mean squared error, high-frequency details are just not important. [1] You can see this by doing PCA on natural image patches. BEGAN uses an autoencoder trained on MSE.
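
A quick way to try the PCA experiment without an image dataset is sketched below. It assumes a synthetic image with a roughly 1/f amplitude spectrum as a stand-in for a natural image (a common approximation, but still an assumption); the function names, image size, and patch size are all illustrative choices.

```python
import numpy as np

def synthetic_image(size=128, seed=0):
    """Random image with a roughly 1/f amplitude spectrum (assumed stand-in)."""
    rng = np.random.default_rng(seed)
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    freq = np.sqrt(fx**2 + fy**2)
    freq[0, 0] = 1.0  # avoid division by zero at the DC term
    spectrum = np.fft.fft2(rng.standard_normal((size, size))) / freq
    return np.real(np.fft.ifft2(spectrum))

def extract_patches(image, patch=8):
    """Non-overlapping patch x patch tiles, each flattened to one row."""
    n = image.shape[0] // patch
    tiles = image[:n * patch, :n * patch].reshape(n, patch, n, patch)
    return tiles.swapaxes(1, 2).reshape(n * n, patch * patch)

def pca_variance_ratio(patches):
    """Fraction of total variance per principal component, largest first."""
    centered = patches - patches.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values**2
    return variances / variances.sum()

ratios = pca_variance_ratio(extract_patches(synthetic_image()))
top_share = ratios[:8].sum()  # variance held by 8 of the 64 components
print(f"top 8 of 64 components hold {top_share:.0%} of the variance")
```

If every component mattered equally, 8 of 64 would hold 12.5% of the variance. In this setup the leading, smooth components should hold far more than that, which is why an MSE-trained model can score well while ignoring fine detail.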

RBMs produce blurry images because the architecture is not good at representing multiplicative interactions. You just get splodges of colour.

[1] http://danielwaterworth.com/posts/what's-wrong-with-autoenco...