
That's interesting, but I'm not sure it works. I think that works out to "for any given prompt, distribute credit to every source image that has a keyword that appears in the prompt, proportional to how many other source images had that same keyword".
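To make sure I'm reading the scheme right, here is a minimal sketch of how I understand it. Everything in it is my assumption, not anything the proposal spells out: the `split_credit` name, the idea that each matched keyword gets an equal slice of the credit, and the idea that each slice is divided evenly among the source images carrying that tag.

```python
from collections import defaultdict

def split_credit(prompt_keywords, image_tags):
    """Hypothetical tag-based credit split (my reading of the proposal).

    image_tags maps an image id to the set of tags on that image.
    Each prompt keyword that matches at least one tag gets an equal slice,
    and that slice is divided evenly among the images holding the tag.
    """
    # Invert the mapping: tag -> list of images that carry it.
    holders = defaultdict(list)
    for image_id, tags in image_tags.items():
        for tag in tags:
            holders[tag].append(image_id)

    # Keep only keywords that some image is tagged with, without duplicates.
    matched = [kw for kw in dict.fromkeys(prompt_keywords) if kw in holders]

    credit = defaultdict(float)
    for kw in matched:
        share = 1.0 / len(matched) / len(holders[kw])
        for image_id in holders[kw]:
            credit[image_id] += share
    return dict(credit)

images = {
    "a": {"floor", "sunset"},
    "b": {"floor"},
    "c": {"theater interior"},
}
credit = split_credit(["floor", "sunset"], images)
print(credit)
```

Note that nothing in this calculation looks at the model or at training: the payout depends only on the tags and the prompt text.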

If I include the tag "floor", do I get some (tiny) percentage of the credit for every generated image whose prompt includes "floor", even if the bits from my image did not end up affecting model weights much at all in training?

Worse, for tags like "dramatic lighting", it's likely that the important source images will depend on the other words in the prompt; "sunset, dramatic lighting" will probably not rely on the same weights or source images as "theater interior, dramatic lighting".

And then you get the perverse incentives to tag every image with every possible tag :)
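A toy illustration of that incentive, under the same assumed rule (equal slice per matched keyword, split evenly among the images carrying the tag). The `credit_for` helper and the example tags are hypothetical:

```python
def credit_for(image_id, prompt_keywords, image_tags):
    """Credit one image would receive under the assumed tag-matching rule."""
    # Keywords that at least one source image is tagged with.
    matched = [
        kw for kw in prompt_keywords
        if any(kw in tags for tags in image_tags.values())
    ]
    total = 0.0
    for kw in matched:
        holders = [i for i, tags in image_tags.items() if kw in tags]
        if image_id in holders:
            total += 1.0 / len(matched) / len(holders)
    return total

prompt = ["sunset", "dramatic lighting"]

# Honest tagging: my image is only a picture of a floor.
honest = {
    "mine": {"floor"},
    "other": {"sunset", "dramatic lighting"},
}

# Same image, but now I have added every tag I can think of.
stuffed = {
    "mine": {"floor", "sunset", "dramatic lighting"},
    "other": {"sunset", "dramatic lighting"},
}

before = credit_for("mine", prompt, honest)
after = credit_for("mine", prompt, stuffed)
print(before, after)
```

The image itself is unchanged between the two cases; only the tags are different, and the payout goes from nothing to half.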

I'd love to be convinced otherwise, but I'm not seeing prompt-to-tag association working.