by simbolit 991 days ago
I am sorry, but I really don't understand.

With the same seed, and an extremely similar prompt, why would you get an entirely different image?

If I take seed 9999999 (just an example) and my prompts are

(1) "very large gothic church at dusk, spooky, horror, red roses" and

(2) "very large gothic church at dusk, spooky, horror, white roses"

then with all models I tested over the last year or so, you get _very_ similar images, with different colored roses and (at most) very minor changes elsewhere. This only seems to work if you keep in mind that the prompt is parsed left to right, so changes closer to the beginning of the prompt have larger effects. Again, of course, you need the same seed.
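
To make the seed part concrete, here is a rough sketch of why this tends to happen, assuming the usual diffusion setup where the seed fixes the starting noise and the prompt only steers the denoising. `initial_noise` is a hypothetical stand-in for a real model's latent sampler, not any library's API:

```python
import random

def initial_noise(seed: int, size: int = 8) -> list:
    """Hypothetical stand-in for the starting latent of a diffusion model."""
    rng = random.Random(seed)  # the seed alone determines the noise
    return [rng.gauss(0.0, 1.0) for _ in range(size)]

prompt_red = "very large gothic church at dusk, spooky, horror, red roses"
prompt_white = "very large gothic church at dusk, spooky, horror, white roses"

# The noise never looks at the prompt, so both runs start from the same point.
noise_for_red = initial_noise(9999999)
noise_for_white = initial_noise(9999999)
same_start = noise_for_red == noise_for_white

# Word-by-word comparison: only one position differs between the prompts.
differing = [
    (a, b)
    for a, b in zip(prompt_red.split(), prompt_white.split())
    if a != b
]
```

With identical starting noise and a one-word difference in the conditioning, it is plausible that the two images stay close; how close depends on the model.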

But, with this said, why would that be any different with plain/full/rich text? Apologies if I am somehow blinkered and asking something really obvious.

1 comment

Yup, it could be similar, but it mostly only works for very simple prompts (e.g., one subject in the image).

For example, in Figure 11 of the paper (https://arxiv.org/pdf/2304.06720.pdf), you can see that full-text "rustic cabin -> rustic orange cabin" does not turn the cabin orange.

For coloring, the core benefit of our method is that it allows precise color control. For example, it can generate colors with rare names (e.g., Plum Purple or Dodger Blue) or even particular RGB triplets that we cannot describe well with texts.
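
As a small illustration of what "particular RGB triplets" means (this is not code from the paper), a color name can be pinned to an exact triplet through its hex code. The sketch assumes "Dodger Blue" refers to the CSS named color `dodgerblue` (`#1E90FF`); `hex_to_rgb` is a hypothetical helper:

```python
def hex_to_rgb(hex_color: str) -> tuple:
    """Convert a '#RRGGBB' string into an (R, G, B) triplet of 0-255 ints."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected 6 hex digits, got {hex_color!r}")
    # Read two hex digits per channel: red, green, blue.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

# Assumption: "Dodger Blue" is the CSS named color dodgerblue.
dodger_blue = hex_to_rgb("#1E90FF")
print(dodger_blue)
```

A triplet like this is unambiguous, whereas a rare color name in a plain-text prompt depends on whether the text encoder has learned it.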

You can see examples in Figure 4 here: https://arxiv.org/pdf/2304.06720.pdf