by astrange
1478 days ago
> One of my favorite examples is the classification model that will identify an apple with a sticker on it that says “pear” as a pear - it makes sense, but is still surprising when you first see it.

That classification model (CLIP) is the first stage of this image generator (DALL-E). This actually shows that CLIP doesn't treat the two as exactly the same thing, or at least that's not the full story, because DALL-E doesn't confuse them. However, other CLIP-guided image generation models do tend to start writing the prompt as text into the image if you push them too hard.