by millimeterman
1168 days ago
My uninformed speculation is that this might be an artifact of the text embedding layer (e.g. CLIP), not the image training data. Presumably, the dataset used to train the embedding layer (which is trained separately and then frozen) is not stripped of copyrighted content, so it will have learned that "Pikachu" is related to words such as "yellow" and "rat". Therefore, even if the image dataset didn't contain a single picture of Pikachu, the image generator would still likely produce a yellow rat, just one that doesn't actually resemble Pikachu.
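To make the idea concrete, here's a toy sketch of what "related in embedding space" would mean. The vectors and their three axes are invented purely for illustration. They are not real CLIP outputs, and a real encoder uses hundreds of learned dimensions with no such tidy meaning. The only point is that a prompt word can sit close to "yellow" and "rat" without any image of the character ever being involved.

```python
import math

# Hypothetical 3-d "embeddings", made up for this example only.
# Assumed axes (also made up): [yellowness, rodent-ness, vehicle-ness].
TOY_EMBEDDINGS = {
    "pikachu": [0.9, 0.8, 0.0],
    "yellow": [1.0, 0.1, 0.0],
    "rat": [0.1, 1.0, 0.0],
    "truck": [0.1, 0.0, 1.0],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def neighbors(word):
    """Return the other toy words, most similar to `word` first."""
    target = TOY_EMBEDDINGS[word]
    scores = {
        other: cosine_similarity(target, vec)
        for other, vec in TOY_EMBEDDINGS.items()
        if other != word
    }
    return sorted(scores, key=scores.get, reverse=True)


print(neighbors("pikachu"))
```

With these made-up numbers, "yellow" and "rat" rank well ahead of "truck", which is the kind of signal the image generator would be conditioned on if the speculation above is right.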