by angusturner
1383 days ago
Developing models that can predict whether stuff is harmful ironically makes it easier for people to optimize for harm. E.g. the one line of code in Stable Diffusion that predicts whether an image is NSFW can be inverted to generate only NSFW images. I tend to agree with OP that there is no technical solution to this problem.