by salik_syed 3518 days ago
As an artist I find it very frustrating when people try to apply style-transfer-type techniques in an attempt to emulate an artist like Picasso. It kinda works and generates a bunch of hype but it isn't even close. The reason it's frustrating is that I think Deep Learning is actually capable of doing this stuff, but the people implementing it need to understand how Picasso actually did his work.

If you look at cubism the whole idea is to capture multiple sides of a 3-Dimensional object at once. A lot of art is not a "style" but rather a projection from 3D (or 4D) space to 2D space.

If you wanted to paint a "dog" in the style of Picasso your network would need to understand the geometry of a dog.

Training on a bunch of 2D before-and-after examples leaves the problem underspecified.

It's important to understand that it is a mapping from 3D -> 2D ... NOT 2D->2D
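
To make the 3D -> 2D point concrete, here is a minimal sketch (my own toy illustration, not how any style transfer system works). It assumes a unit cube as the subject and a plain orthographic projection; `rotate_y`, `project` and `multi_view` are hypothetical helper names. The same 3D points are projected from two viewpoints, which is the information a cubist composition combines and which a single 2D input image does not contain.

```python
import math

# Eight corners of a unit cube centred at the origin (a stand-in for any 3D subject).
CUBE = [(x, y, z) for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]

def rotate_y(point, angle):
    """Rotate a 3D point about the vertical (y) axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def project(point):
    """Orthographic projection: drop the depth (z) coordinate."""
    x, y, _ = point
    return (x, y)

def multi_view(points, angles):
    """Project the same 3D points from several viewpoints, keyed by viewing angle."""
    return {angle: [project(rotate_y(p, angle)) for p in points] for angle in angles}

# Front view plus a view rotated by 45 degrees.
views = multi_view(CUBE, [0.0, math.pi / 4])
```

From the front, the eight corners collapse onto the four corners of a square; the rotated view spreads them out again. A 2D -> 2D mapping only ever sees one of these projections.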

Another example is "Nude Descending a Staircase" by Duchamp: https://en.wikipedia.org/wiki/Nude_Descending_a_Staircase,_N...

It is a painting describing motion. To apply style transfer would be completely stupid because the point of the image is to project 4D->2D ... not to have wavy black and brown lines.

4 comments

Unfortunately, I feel like it is the case that much current DNN work is predicated specifically around not understanding the problem, in a kind of Skinnerian rejection of GOFAI; the hope is that the signal in the data is strong enough that the statistical learning will 'understand' it for you, and all you need to worry about is tweaking hyperparameters until it clicks.

To the point of your concern, for various, and likely numerous, reasons, this does not always seem to occur.

To translate that kind of conceptual aesthetic logic into an algorithm, the programmer essentially needs to become the artist: make subjective creative decisions about the style to achieve, and enshrine those into code. And (as dmreedy wrote in a sibling comment) that's specifically the kind of "old-school" AI approach the current DNN-based work is trying to avoid.

I'm not as optimistic as you that the current statistics-driven approaches could ever reach the kind of deep analytic modeling that would be required for a style transfer system to be able to look at a Picasso and infer that there's a 3D->2D mapping at play... And it's a very interesting thought because (to me) it seems to demonstrate how far we are from actual AI that could make that kind of inventive conceptual leap.

What data does an artist consider when he paints? He does sort of an optimization procedure very similar to what something like Deep Dream does. But rather than doing response optimization to make random-noise more "dog-like" or "cat-like" or "human-like" (as Deep Dream does) the optimization is done to evoke a certain feeling within the artist himself. To create more extreme feelings than just a photo-realistic rendering.
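
A minimal sketch of that kind of response optimization, under loud assumptions: the "detector" here is a hypothetical four-pixel template match that I made up, not a neural network and not Deep Dream's actual code. The loop is plain gradient ascent on the input, starting from noise. The idea in the paragraph above would amount to swapping `response` for some score of the feeling evoked, which this toy does not attempt.

```python
import random

# Hypothetical detector: responds most strongly when the image equals this template.
TEMPLATE = [0.0, 1.0, 1.0, 0.0]

def response(image):
    """Detector response: negative squared distance between image and template."""
    return -sum((p - t) ** 2 for p, t in zip(image, TEMPLATE))

def response_gradient(image):
    """Gradient of `response` with respect to each pixel."""
    return [-2.0 * (p - t) for p, t in zip(image, TEMPLATE)]

def dream(image, steps=50, step_size=0.1):
    """Gradient ascent on the input: nudge each pixel to increase the response."""
    image = list(image)
    for _ in range(steps):
        grad = response_gradient(image)
        image = [p + step_size * g for p, g in zip(image, grad)]
    return image

rng = random.Random(0)  # fixed seed so the run is repeatable
noise = [rng.random() for _ in TEMPLATE]
dreamed = dream(noise)
```

After the loop, `dreamed` scores higher than `noise` on the detector, because every step moves the pixels in the direction that increases the response.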

Feelings and images are correlated with each other through experience. Certain images are fundamental to human experience and the human brain through evolution (a mother smiling, scary monsters). Others are learned (ever been hit by a car? I bet that every time you see that exact model and color of car you'll feel an emotion).

Here's a thought experiment:

What if we fed the deep learning "painter" tons of 3D animation? Each point in time would be a full 3D scene, labelled with emotions such as "scary", "happy" or "angry".

I bet the algorithm could generate original art and learn new artistic styles by maximizing response to certain permutations of feelings.

Learning from video is researched in many papers. The way video data is structured, it allows for identification of new objects by comparing consecutive frames. It creates a "model of the physical world" that can predict the future a few time steps ahead. It is being used to identify activities and to help plan robotic movements.
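
The crudest version of "comparing consecutive frames" is frame differencing, sketched below. This is only an illustration of the idea; the papers referred to above use learned predictive models, not this. The function name `moving_pixels`, the threshold and the two tiny frames are made up for the example.

```python
def moving_pixels(prev_frame, next_frame, threshold=0.1):
    """Return (row, col) positions whose brightness changed by more than `threshold`."""
    return [
        (r, c)
        for r, (row_a, row_b) in enumerate(zip(prev_frame, next_frame))
        for c, (a, b) in enumerate(zip(row_a, row_b))
        if abs(a - b) > threshold
    ]

# Two tiny grayscale frames: a bright blob moves one pixel to the right.
frame_0 = [[0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0]]
frame_1 = [[0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 0.0]]

changed = moving_pixels(frame_0, frame_1)
```

Only the pixel the blob left and the pixel it entered are flagged, which is the raw signal that something in the scene is a separate, moving object.
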
I think for cubism, the easiest way would be to create the "original" painting/image as it was before the cubist transformation and learn the transformation from the pair.

Ideally you'd use several such images, so as not to overfit on the specific details of that transformation.
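
As a toy sketch of learning a transformation from paired examples: assume (purely for illustration) that the transformation is just a per-pixel gain and bias, and fit it by ordinary least squares over pixels pooled from several before/after pairs. A real cubist transformation is obviously nothing like an affine intensity map; `fit_affine` and the numbers below are made up to show the fitting step only.

```python
def fit_affine(pairs):
    """Least-squares fit of after = gain * before + bias over (before, after) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    gain = cov_xy / var_x
    bias = mean_y - gain * mean_x
    return gain, bias

# Hypothetical paired data: (before, after) pixel intensities from two image pairs,
# pooled so the fit is not tied to the details of any single image.
image_pairs = [
    [(0.0, 0.2), (0.5, 0.5), (1.0, 0.8)],
    [(0.2, 0.32), (0.8, 0.68)],
]
pooled = [pair for image in image_pairs for pair in image]
gain, bias = fit_affine(pooled)
```

The made-up data follows after = 0.6 * before + 0.2, and the fit recovers that gain and bias from the pooled pairs.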

The other way is to get a huge dataset of cubist still-life paintings and also a dataset of a bunch of still-life photographs, and learn the "average" transformation from that. Although such a transformation may not generalize to other subjects and might only work well with flowers/food on a table.

Same thing with the other styles. For example (NSFW), photographs of naked women such as https://www.daniel-bauer.com/images/art_nudes/15_artistic_nu... transformed into classical https://www.google.com/culturalinstitute/beta/asset/the-birt... Here you would first identify the people objects and then learn the transformation of both people and backgrounds.

Still, the current approach works fine for things like Starry Night because of the nature of the painting.

So how about showing us a demo of what you mean? (from the playbook wherein "he who criticizes, volunteers")

Yes, most people working on deep learning are making small incremental improvements, and yes it's a little tiresome to see each one trumpeted as some big advance.

But it's really hard to make fundamental advances. Which shouldn't stop you from working on it.

Agree that it is easy to be a critic ... I am indeed working on it :)

That being said -- my goal is to point toward a better approach rather than to criticize.