by jcjohns 3522 days ago
Real-time neural style transfer is not new; in the past year there have been several academic papers [1-4] on this topic and several open-source code releases:

https://github.com/jcjohnson/fast-neural-style

https://github.com/DmitryUlyanov/texture_nets

https://github.com/chuanli11/MGANs

Neural style blending is also not new; I did it more than a year ago using an optimization-based method:

https://github.com/jcjohnson/neural-style#multiple-style-ima...

The novelty of this work is a clever way to train a single network that can apply many different styles; existing methods for real-time style transfer train a separate network per style. Their method also allows for real-time style blending, which is very cool and to my knowledge has not been done before.

(Disclaimer: I'm the author of [2])

[1] Ulyanov et al, "Texture Networks: Feed-forward Synthesis of Textures and Stylized Images", ICML 2016

[2] Johnson et al, "Perceptual Losses for Real-Time Style Transfer and Super-Resolution", ECCV 2016

[3] Li and Wand, "Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks", ECCV 2016

[4] Ulyanov et al, "Instance Normalization: The Missing Ingredient for Fast Stylization", arXiv 2016


Thanks for releasing your code for this! Also really enjoy your cs231n lectures on youtube.

There is some grain/small repeating pattern in many of the style transfer pictures I've seen - could it be an artifact of this? http://distill.pub/2016/deconv-checkerboard/

Yes, I think that is a likely explanation. Also note that Vincent Dumoulin is an author of both the deconv-checkerboard blog post and the new paper from Google, and that the new Google paper uses the upsample+convolution technique suggested by the deconv-checkerboard blog post.
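
A rough way to see why upsample+convolution helps, as a 1D sketch in plain Python rather than code from either paper: count how many kernel taps contribute to each output position. The function names, sizes, and the choice of nearest-neighbor upsampling are assumptions made for illustration.

```python
def transposed_conv_overlap(n_in, kernel, stride):
    """Count how many kernel taps land on each output position of a
    1D transposed convolution (no padding)."""
    n_out = (n_in - 1) * stride + kernel
    counts = [0] * n_out
    for i in range(n_in):
        for k in range(kernel):
            counts[i * stride + k] += 1
    return counts


def resize_conv_overlap(n_in, kernel, scale):
    """Count taps per output position for nearest-neighbor upsampling by
    `scale` followed by a stride-1 convolution with zero 'same' padding."""
    n_out = n_in * scale
    pad = kernel // 2
    return [
        sum(1 for k in range(kernel) if 0 <= o + k - pad < n_out)
        for o in range(n_out)
    ]


# Kernel size 3 is not divisible by stride 2, so the overlap alternates.
deconv_counts = transposed_conv_overlap(n_in=4, kernel=3, stride=2)
resize_counts = resize_conv_overlap(n_in=4, kernel=3, scale=2)
print(deconv_counts)
print(resize_counts)
```

With these sizes the transposed convolution alternates between one and two contributions per output position, which is the uneven overlap the blog post ties to checkerboard patterns, while the resize-then-convolve version is uniform away from the borders.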

Hey jcjohns, fan of your work.

I've noticed that your fast-neural-style project uses instance normalization instead of batch normalization.

Batch normalization has the benefit that you can merge the gamma & beta into a convolutional layer on the forward pass, which makes it a lot faster by allowing you to skip a step when building the styled images using a trained model.

Can the same be done with instance normalization? I didn't see a formula in the paper but I would think so, since they are fairly closely related.

I've found that instance normalization usually gives better results so I prefer it over batch normalization.

With batch norm you have four scalars per convolutional feature map: mu (mean), sigma (stddev), alpha (scale, the gamma in your question) and beta (shift). Alpha and beta are learned. During training, mu and sigma are estimated from data statistics; during testing they are constants, either estimated from the entire training set or computed as a running mean during training. At test time the batch norm operation is then alpha * (x - mu) / sigma + beta, which is an affine operation (a per-channel scale and shift) since everything but x is constant; a convolution with a bias is also affine, so the two can be merged into a single convolutional layer.
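
To make the merge concrete, here is a minimal sketch in plain Python, assuming a 1x1 convolution applied to a single pixel so that the convolution reduces to a small matrix multiply. The helper names and all numeric values are made up for illustration; this is not code from fast-neural-style.

```python
def conv1x1(x, weight, bias):
    """1x1 convolution at one pixel: x holds one value per input channel,
    weight is indexed as [out_channel][in_channel]."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def batch_norm_test(z, mu, sigma, alpha, beta):
    """Test-time batch norm: every statistic is a fixed per-channel constant."""
    return [a * (v - m) / s + bt
            for v, m, s, a, bt in zip(z, mu, sigma, alpha, beta)]


def fold_batch_norm(weight, bias, mu, sigma, alpha, beta):
    """Return conv weights and bias that reproduce conv followed by batch norm."""
    scale = [a / s for a, s in zip(alpha, sigma)]
    folded_weight = [[w * k for w in row] for row, k in zip(weight, scale)]
    folded_bias = [(b - m) * k + bt
                   for b, m, k, bt in zip(bias, mu, scale, beta)]
    return folded_weight, folded_bias


# Illustrative values: 3 input channels, 2 output channels.
weight = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
bias = [0.1, -0.2]
mu, sigma = [0.3, -0.4], [2.0, 0.5]
alpha, beta = [1.2, 0.8], [0.05, -0.1]
x = [1.0, 2.0, -3.0]

reference = batch_norm_test(conv1x1(x, weight, bias), mu, sigma, alpha, beta)
folded_weight, folded_bias = fold_batch_norm(weight, bias, mu, sigma, alpha, beta)
folded = conv1x1(x, folded_weight, folded_bias)
max_diff = max(abs(a - b) for a, b in zip(reference, folded))
print(max_diff)
```

The same per-output-channel scaling applies to a kernel of any spatial size; the 1x1 case just keeps the arithmetic short.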

With instance norm, mu and sigma are estimated from data statistics during both training and testing; this means that the test-time forward pass is nonlinear, so it cannot be merged into a convolution (which is linear).
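
A small sketch of why no fixed scale and shift can stand in for instance norm, again in plain Python with made-up values (the `eps` term and the single-channel setup are assumptions for illustration): doubling the input leaves the output essentially unchanged, whereas any fixed affine map with a nonzero scale would change it.

```python
import math


def instance_norm(channel, alpha=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature map using statistics computed from that map itself."""
    n = len(channel)
    mu = sum(channel) / n
    var = sum((v - mu) ** 2 for v in channel) / n
    sigma = math.sqrt(var + eps)
    return [alpha * (v - mu) / sigma + beta for v in channel]


# One channel of one image, flattened to a list of activations.
x = [1.0, 2.0, 4.0, 7.0]
out_x = instance_norm(x)
out_2x = instance_norm([2.0 * v for v in x])

# mu and sigma rescale along with the input, so the two outputs nearly coincide.
scale_gap = max(abs(a - b) for a, b in zip(out_x, out_2x))
print(scale_gap)
```

Because mu and sigma are recomputed for every input, the effective scale and shift differ from image to image, so there are no constants to bake into the convolution weights ahead of time.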

Awesome, thanks for your response!

Also check out the Prisma app for a slick mobile-friendly UI and a subset of pre-selected art styles (a layman's application of this tech).

Don't forget everyone other than Prisma who does the same thing, sometimes better.

Here's mine. It runs on Mac & Windows, runs locally on all its styles unlike Prisma, runs HD images, and can process unlimited video: http://macdaddy.io/Style/

Prisma's Instagram account is fantastic - https://www.instagram.com/prisma/