by astrange
725 days ago
Model collapse was basically a coping idea made up by artists who were hoping AI image generators would all magically destroy themselves at some point; I don't think it was ever considered likely to happen. It does seem to be true that clean data works better than low-quality data.

Model collapse itself is (was?) a fairly serious research topic: https://arxiv.org/abs/2305.17493
By now we've reached "probably not inevitable": https://arxiv.org/abs/2404.01413 argues there's a finite upper bound on the error. But I'd also point out that that paper assumes the training set grows with every training generation and is strictly accumulative, i.e. earlier data is never thrown away.
To a first order, that means you'd better have a pre-2022 dataset to get started, and have archived it well.
But it's probably fair to say current SOTA is still more or less "it's neither impossible nor inevitable".
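
For anyone who wants to poke at the replace-vs-accumulate distinction, here's a toy sketch. It is not the setup from either paper: the "model" is just a 1-D Gaussian fitted to its training set, and `fit_gaussian`, `simulate`, and all the parameter values are made up for illustration. Each generation fits the Gaussian, samples synthetic data from the fit, and then either trains the next generation on the synthetic data alone (replace) or adds it to everything collected so far (accumulate).

```python
import math
import random


def fit_gaussian(data):
    """Maximum-likelihood mean and standard deviation of a list of floats."""
    n = len(data)
    mean = math.fsum(data) / n
    var = math.fsum((x - mean) ** 2 for x in data) / n
    return mean, math.sqrt(var)


def simulate(accumulate, generations=500, n=20, seed=0):
    """Return the fitted std after repeatedly training on model output.

    accumulate=False: each generation sees only the previous generation's
                      synthetic samples (real data is discarded).
    accumulate=True:  each generation sees the real data plus all synthetic
                      samples produced so far.
    """
    rng = random.Random(seed)
    # Generation 0: "real" data drawn from a standard normal.
    train = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(generations):
        mean, std = fit_gaussian(train)
        synthetic = [rng.gauss(mean, std) for _ in range(n)]
        if accumulate:
            train.extend(synthetic)   # training set grows every generation
        else:
            train = synthetic         # training set is replaced
    return fit_gaussian(train)[1]


if __name__ == "__main__":
    print("replace    final std:", simulate(accumulate=False))
    print("accumulate final std:", simulate(accumulate=True))
```

In the replace setting the fitted std should shrink toward zero over enough generations, while in the accumulate setting it should stay in the neighbourhood of the original value. That's only the Gaussian toy case, so take it as intuition for why the accumulation assumption matters, not as evidence about real models.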