by Udik
3660 days ago
There is something that escapes me about this very cool neural style transfer technique. One would expect it to need at least three starting images: the one to transform, the one used as the source of the style, and a non-styled version of that source. The last one would give the network hints on how to turn the unstyled version into the styled one. For example, what does a straight line end up looking like in the style? Or how is a colour gradient represented? Without it, the neural network would seemingly have to recognize objects in the styled picture and derive the transformation from prior knowledge of what they normally look like. But surely the NN is not advanced enough to do that.
Can someone explain to me roughly how this works?
|
I believe this is done using Restricted Boltzmann Machines[1] trained on the stylized image.
Think of it as a network that receives an image on the input layer, sends it through one or more hidden layers with fewer nodes (like an auto-encoder), and then tries to reconstruct the image on the output nodes. It acts like a lossy compressor-decompressor overfitted to the stylized image.
Now just pass the real image as input to the network, and the output should be a stylized version of it.
[1] http://deeplearning4j.org/restrictedboltzmannmachine
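To make the "lossy compressor-decompressor" picture above concrete, here is a rough sketch. Big caveats: it uses a plain linear auto-encoder trained by gradient descent, not an actual RBM, it works on small grayscale patches, and all the function names and parameters (extract_patches, train_autoencoder, restyle, the patch size, the learning rate) are made up for illustration. It only shows the idea of overfitting a bottleneck network to one image and pushing another image through it; I am not claiming this is how the technique in the article is implemented.

```python
import numpy as np

def extract_patches(img, size):
    """Cut a 2-D grayscale image into non-overlapping size x size patches,
    one flattened patch per row. Edges that do not fit are cropped."""
    h, w = img.shape
    h, w = h - h % size, w - w % size
    return (img[:h, :w]
            .reshape(h // size, size, w // size, size)
            .swapaxes(1, 2)
            .reshape(-1, size * size))

def train_autoencoder(patches, hidden, epochs=2000, lr=0.05, seed=0):
    """Fit a linear auto-encoder (input -> hidden -> output) to the patches
    by gradient descent on the mean squared reconstruction error.
    Assumption: a linear bottleneck stands in for the RBM."""
    rng = np.random.default_rng(seed)
    d = patches.shape[1]
    n = len(patches)
    W_enc = rng.normal(0.0, 0.1, (d, hidden))   # input  -> hidden
    W_dec = rng.normal(0.0, 0.1, (hidden, d))   # hidden -> output
    for _ in range(epochs):
        code = patches @ W_enc                  # compress
        err = code @ W_dec - patches            # reconstruction error
        grad_dec = code.T @ err / n
        grad_enc = patches.T @ (err @ W_dec.T) / n
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec

def restyle(img, W_enc, W_dec, size):
    """Push another image through the network trained on the style image
    and stitch the reconstructed patches back together."""
    h, w = img.shape
    h, w = h - h % size, w - w % size
    out = extract_patches(img, size) @ W_enc @ W_dec
    return (out.reshape(h // size, w // size, size, size)
               .swapaxes(1, 2)
               .reshape(h, w))
```

Usage would be something like W_enc, W_dec = train_autoencoder(extract_patches(style_img, 4), hidden=4) followed by restyle(content_img, W_enc, W_dec, 4), where style_img and content_img are 2-D float arrays with values in [0, 1]. Because the bottleneck can only represent what it saw in the style image, the output is forced toward the style image's patch statistics, which is the intuition described in the comment.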