

by afp14 832 days ago

So here's my hypothesis as to how it works now (and yes, I believe this "problematic bias" is baked into the core model): I think they have a "dumb", simple "is this a person?" detector stitched onto the end of the pipeline. If it detects a person, the generated images get dropped.
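To make the hypothesis concrete, here is a minimal Python sketch of what a post-generation filter like that could look like. Everything in it is an assumption for illustration: `GeneratedImage`, `filter_generated`, and `toy_person_detector` are made-up names, the "detector" just checks a tag instead of looking at pixels, and it drops images one at a time, whereas the real system (if it works this way at all) might drop the whole batch.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GeneratedImage:
    """Stand-in for a generated image; `tags` is a hypothetical placeholder for pixel data."""
    prompt: str
    tags: frozenset


def filter_generated(
    images: List[GeneratedImage],
    contains_person: Callable[[GeneratedImage], bool],
) -> List[GeneratedImage]:
    """Return only the images that the detector does not flag."""
    return [img for img in images if not contains_person(img)]


def toy_person_detector(img: GeneratedImage) -> bool:
    # Hypothetical detector: flags an image if it carries a "person" tag.
    # A real one would be a classifier running on the image itself.
    return "person" in img.tags


batch = [
    GeneratedImage("a mountain lake", frozenset({"landscape"})),
    GeneratedImage("a doctor at work", frozenset({"person", "indoor"})),
]
kept = filter_generated(batch, toy_person_detector)
print([img.prompt for img in kept])  # prints ['a mountain lake']
```

The point of the sketch is only that the filter sits after generation and is independent of the generator, so the two components can disagree about what an image contains.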

This suggests a jailbreak: if you can fool the "dumb" person detector (likely shallow rather than deep learning) with your deep learning generator, the images pass through. I had some success last week with "upside down" or "standing on their head", though it may have been patched. You could also try occlusion ("wearing a mask", etc.). Basically, find phrasing that would fool a naive person/face detector in the corresponding image.

As for the model, I truly believe it to be, paradoxically, pro-white racist. Imagine any stereotype for any race. If you say "generate <stereotype> of a <historically white character/role>", the model swaps out the white race for the stereotyped race, seemingly embracing racist stereotypes. White stereotypes, on the other hand, are much harder to produce, since the model is hesitant to render white folks.