by IYasha
1254 days ago
Yes, the proof... Actually, there must be some diff tool to compare a model before and after it processes some source? I'm not sure, but it must be possible to detect pieces of some ingested data in the model itself. I've seen the famous "wolf misdetection" investigation screenshots, where the AI apparently labeled a dog as a wolf because there was snow in the picture.
There are probably pathological cases where an image repeated many times in the training data is overfit more strongly and could be reproduced in much more detail than this average, though. But the systems learn similarly to the human brain: they learn the gist of a style or scene and how it relates to words. It's not a search engine, and it doesn't copy/paste any block of pixels...
One interesting example: since SD's original training set included some watermarked stock photos, it learned that there is a concept of a watermark, which can end up in the middle of generated images. Not in an intelligible way, but you can see roughly how it interpreted this detail. And in those cases you DO have a nearly identical pixel pattern repeated many times in the training data.
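For what it's worth, one rough way to look for this kind of memorization from the outside is to compare a generated image against training images with a perceptual hash. This is only a sketch under assumptions: the function names are hypothetical, the images are tiny same-sized grayscale grids stored as plain Python lists, and "average hash" is a generic near-duplicate technique, not anything SD itself uses.

```python
# Hypothetical helper names. Images are plain 2D lists of grayscale
# values (0-255), all of the same size.

def average_hash(image):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)

def hamming(a, b):
    """Count the positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))

def closest_training_match(generated, training_images):
    """Return (index, distance) of the training image whose hash is nearest."""
    g = average_hash(generated)
    distances = [hamming(g, average_hash(t)) for t in training_images]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]

# Toy "training set": one image split vertically, one split horizontally.
training = [
    [[10, 10, 200, 200]] * 4,
    [[200] * 4, [200] * 4, [10] * 4, [10] * 4],
]

# A noisy near-copy of training[0], standing in for a generated image.
generated = [
    [12, 9, 198, 205],
    [11, 10, 201, 199],
    [10, 14, 190, 200],
    [9, 10, 202, 200],
]

index, distance = closest_training_match(generated, training)
print(index, distance)
```

A small distance only says the output looks like a near-duplicate of that training image; it says nothing about what is stored inside the model, and a real check would need full-size images and a much larger training set.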