It's not that simple. The model was not trained to recognize "harmful" actions such as blowjobs (although "bombing" and other atrocities are of course in there).
The model was trained on eight specific body parts. If it doesn't see those, it doesn't fire. That's 100% of the job.
I see that you've managed to name things that you think aren't in the model. That's nice. That's not related to what this company did, though.
You seem to be confusing how you think a system like this might work with what this company has clearly said it did. This isn't hypothetical. You can just go to their webpage and look.
The NSFW filter on Stable Diffusion is simply an image body part recognizer run against the generated image. It has nothing to do with the prompt text at all.
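To make the distinction concrete, here is a minimal sketch of what a post-generation image filter looks like in general. It is not the actual Stable Diffusion code: the label names, `FLAGGED_LABELS`, `filter_generated_image`, and `toy_detector` are all hypothetical stand-ins. The only point is that the check receives the finished image and never the prompt.

```python
from typing import Callable, List, Sequence, Set

# Hypothetical label set. The real list of recognized parts is not reproduced here.
FLAGGED_LABELS: Set[str] = {"flagged_part_a", "flagged_part_b"}

# Toy grayscale image: rows of pixel values.
Image = List[List[int]]


def filter_generated_image(
    image: Image,
    detect_labels: Callable[[Image], Sequence[str]],
) -> Image:
    """Return the image unchanged, or a black image if the detector fires.

    The prompt is deliberately not a parameter: the decision depends
    only on what the detector finds in the pixels.
    """
    detected = set(detect_labels(image))
    if detected & FLAGGED_LABELS:
        # Replace the output with a black image of the same size.
        return [[0 for _ in row] for row in image]
    return image


def toy_detector(image: Image) -> List[str]:
    """Stand-in for a real recognizer: 'fires' when any pixel equals 255."""
    if any(pixel == 255 for row in image for pixel in row):
        return ["flagged_part_a"]
    return []
```

With this shape, an image in which the detector finds nothing passes through untouched no matter what text produced it, and an image in which it finds a flagged label is blanked no matter how innocent the prompt was.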
The company filtered LAION-5B based on undisclosed criteria. So what you are saying is actually irrelevant, since we do not know which pictures were included and which were not.
It is obvious to anyone who bothers to try - have you? - that a filter was applied at the training level. Rare activities such as "kitesurfing" produce flawless, accurate pictures, whereas anything sexual or remotely lewd ("peeing") doesn't. This is a conscious decision by whoever produced this model.