by themoonisachees
613 days ago
Not really surprising. Vector embeddings are really not that great at conveying arbitrary "without"s. The words the model sees are "alligator", "tail", and "without", but "without" on its own carries almost no meaning. Anything mentioned in the prompt tends to get drawn, so the model makes extra sure there is a tail in the image. The exception is when it's common to refer to the thing with that element removed, for example a French king without a head. Some prompting tools let you specify negative words, which is useful if, for example, you want a picture of a mustang, the horse: you can specify negative: car, and the model will avoid diffusing into anything that looks like a car. You can't get that level of control from ChatGPT.
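For the curious, here is a rough sketch of how negative prompts are commonly wired up in Stable Diffusion-style tools, as far as I understand it: the negative prompt's prediction takes the place of the empty-prompt prediction in classifier-free guidance, so each denoising step is pushed away from it. The function name, the 2-D vectors, and all numbers are made up for illustration. A real pipeline does this on full noise tensors.

```python
def guided_noise(cond, neg, scale=7.5):
    """Classifier-free guidance: start at the negative-prompt prediction
    and extrapolate toward the positive-prompt prediction."""
    return [n + scale * (c - n) for c, n in zip(cond, neg)]

# Toy 2-D "noise predictions" (hypothetical values):
# axis 0 = horse-like direction, axis 1 = car-like direction.
cond = [1.0, 0.5]    # prediction for the prompt "mustang" (ambiguous)
uncond = [0.0, 0.0]  # prediction for the empty prompt
neg = [0.0, 1.0]     # prediction for the negative prompt "car"

plain = guided_noise(cond, uncond)  # no negative prompt
steered = guided_noise(cond, neg)   # negative: car

print("plain:  ", plain)
print("steered:", steered)
```

With the negative prompt, the car-like component ends up lower than without it, while the horse-like component is unchanged in this toy setup.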
That's an old approach, used in SD 1/2-level solutions. For GPT that answer is incorrect/outdated; we've moved past that approach. Newer models use sentence-level embeddings, which can represent meaning beyond individual words - for example, Flux uses T5. OpenAI has been using some form of that for quite a while.