by visarga
3514 days ago
Style transfer is part of a new trend concerned with generating content. Generating images or text is very difficult because the space of possible shapes or messages is effectively infinite and high-dimensional. We know how to classify into 1000 categories (which corresponds to generating a tag from a set of 1000 choices), but painting requires selecting a combination of pixels from a much, much higher-dimensional space. Hence the difficulty. But I think that generating in high-dimensional spaces, as in translation, style transfer, gameplay and robotics, is the most interesting part of AI. It is what makes AI appear more intelligent and creative to us. AlphaGo was impressive because it could select move sequences from a space far larger than 10^120 possible combinations (Go has roughly 10^170 legal board positions; compare that with an ImageNet classifier, which chooses from a space of 10^3 labels). So, in conclusion, it is essential to learn to generate images, text, sounds and behavior or movement that are just as complex and coherent as those created by humans. Being able to do so would get us halfway to AGI: we could have talking, moving robots that are not lame. Remember the latest text-to-speech engine from DeepMind: that is speech generation from a higher-dimensional space, and it shows the difference compared to regular TTS.
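To put rough numbers on the classification vs. generation gap, here is a back-of-the-envelope sketch. The image size (256x256 RGB, 8 bits per channel) is my own assumption for illustration and is not tied to any particular model; `log10_output_space` is a hypothetical helper name.

```python
import math


def log10_output_space(num_slots: int, choices_per_slot: int) -> float:
    """Return log10 of choices_per_slot ** num_slots without building the huge number."""
    return num_slots * math.log10(choices_per_slot)


# An ImageNet-style classifier picks one label out of 1000.
classifier_log10 = log10_output_space(1, 1000)

# Assumed example: a 256x256 RGB image with 256 possible values per channel.
image_log10 = log10_output_space(256 * 256 * 3, 256)

print(f"classifier output space: about 10^{classifier_log10:.0f}")
print(f"image output space:      about 10^{image_log10:.0f}")
```

Even at this modest resolution the exponent itself runs into the hundreds of thousands, which is the sense in which picking pixels is a much harder problem than picking a label.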