by westoncb
1531 days ago
> Composite those things together manually and add a style transfer you'll get similar results to DALL-E as that is what it is doing more or less.
If you actually try doing this, it will be trivial to see that this assertion is incorrect.
1. The elements of the images are integrated at a level deeper than style. For instance, see the image in the top row, second column: it has integrated the blue bird wings onto the man, not simply grafting them on but giving the appearance of their being draped on like a cloak, partly behind and partly in front of him (and this is consistent with the man's posture and the rays of light, evoking a certain coherent cultural idea/image). You might be able to integrate multiple images (of man, bird, rays, etc.) together and apply style transfer to arrive at a poor approximation of this, but even then, the decision to place the elements together in such a way would require creativity on your part.
2. The one example set of trial images (generated from the phrase "expressive painting of a man shining rays of justice and transparency on a blue bird twitter logo") is among the easiest in the full group to pick apart into its various elements; if you try this thought experiment with the others in the thread, you'll see this idea is by far insufficient.
|
> the decision to place the elements together in such a way would require creativity on your part
I strongly suspect that's because it found similar compositions in its training set. So what exactly is going on here is fascinating.
Did it learn compositing? Is that why the image output is now much more stable? Or is it merely finding similar artwork and competently recreating/mimicking existing compositions from different building blocks? So now we can not only transfer styles but also transfer compositions. That could be the beginning of something useful. Instead of a text prompt, I'd give it my crappy doodle and it would respond with an improved/different one that is comparable (also a great way to steal, though).
And of course I picked the one that is easiest to tease apart where it is most evident so people will see what I mean.
> if you try this thought experiment with the others in the thread, you'll see this idea is by far insufficient
That depends on your imagination and your artistic eye I guess. Even if somebody could do that they certainly couldn't make you believe them. That's the accomplishment.
Neither one of us can prove it one way or the other so long as the model is a black box, and certainly not while we have only curated examples rather than direct access to OpenAI.