by kauguste281 1523 days ago
Can anybody explain "Gollum writes his autobiography"[1]? The images themselves look extremely well "rendered", good lighting and all and they capture the description quite well. But the "Gollum" in the images doesn't look like any common version of Gollum I could find. Google Image Search and most other places are completely flooded with the movie version of Gollum, which looks very different. There are animated versions that look a little closer, but nothing I could find looks like the images produced by the AI.

I'd love to search through the training data to figure out what is going on here, but apparently that isn't publicly available either.

[1] https://twitter.com/Merzmensch/status/1513611885576347658

3 comments

They did some filtering to make it difficult to generate certain things like people, celebrities, NSFW content, etc. This has unpredictable consequences for downstream tasks, particularly if the filtering is aggressive and also removes false positives, i.e. content wrongly flagged as belonging to one of those categories.
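To make the false-positive point concrete, here is a toy sketch. It assumes the filtering is a score threshold applied to training samples, which is a guess and not a description of the actual pipeline. The captions, labels, scores, and function names are all made up for illustration.

```python
# Toy illustration only: captions, labels, and scores are invented,
# and the threshold filter is an assumption, not the real pipeline.
from dataclasses import dataclass


@dataclass
class Sample:
    caption: str
    has_person: bool      # hypothetical ground truth
    person_score: float   # hypothetical classifier score in [0, 1]


SAMPLES = [
    Sample("portrait of an actor", True, 0.97),
    Sample("crowd at a concert", True, 0.81),
    Sample("Gollum fan art", False, 0.62),
    Sample("goblin illustration", False, 0.35),
    Sample("mountain landscape", False, 0.03),
]


def filter_dataset(samples, threshold):
    """Keep only samples whose person score is below the threshold."""
    return [s for s in samples if s.person_score < threshold]


def false_positives_removed(samples, threshold):
    """Captions of non-person samples that the filter drops anyway."""
    return [
        s.caption
        for s in samples
        if s.person_score >= threshold and not s.has_person
    ]


# A lenient threshold drops only the obvious person image.
print(false_positives_removed(SAMPLES, 0.9))
# An aggressive threshold also drops humanoid creatures that are not people.
print(false_positives_removed(SAMPLES, 0.3))
```

In this made-up data the aggressive threshold throws out the Gollum and goblin images along with the real people, so a model trained on what is left would have seen very few examples of the character.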
Yes, here is a possible explanation: it face swaps.

Some observations:

Note that the writing utensil is always in the right hand. After the first image it becomes more evident that it is neither a pen nor a feather nor anything like that, but a wispy, blurry line that goes nowhere.

The book pages are always blank.

All the creatures are green and in the same pose.

The arms are wrong and often disconnected from the hands.

I believe the way to look at DALL-E 2 output is to break the prompt and the resulting image into distinct concepts/layers.

Each layer is cribbed from some pre-existing image. Hands from here, arms from there, head from Yoda, swap the face. Blur everything together with a style transfer.

Gollum is presumably considered conceptually close to goblin. All of these are clearly goblins.