by astrange 1478 days ago

> One of my favorite examples is the classification model that will identify an apple with a sticker on it that says “pear” as a pear—it makes sense, but is still surprising when you first see it.

That classification model (CLIP) is the first stage of this image generator (DALL-E). That actually shows CLIP doesn't think the two are exactly the same thing, or at least that's not the full story, because DALL-E doesn't confuse them.

However, other CLIP-guided image generation models do like to start writing the prompt as text into the image if you push them too hard.