Hacker News
by marc_omorain 3696 days ago
This reminds me of Shutterstock's open-source List of Dirty Naughty Obscene and Otherwise Bad Words:

https://github.com/shutterstock/List-of-Dirty-Naughty-Obscen...

7 comments

This is the most beautiful commit message I've read in a long time.

https://github.com/shutterstock/List-of-Dirty-Naughty-Obscen...

Now what exactly is the thing it refers to?

An exceedingly overweight and possibly unclean individual, according to http://nb.urbandictionary.com/define.php?term=shit%20blimp
Interesting. "anal" is a bad word in English, but not in German. On the other hand, "naked" is a bad word in German, but not in English.

Maybe I should send a pull request...

It may also be that they left it out for a reason. There are a lot of edge cases, because many potentially dirty concepts are made up of words that are not bad on their own. For example, a text can contain both "girls" and "nude" without being vulgar, but if it contains the phrase "nude girls", the chance of it being pornographic is much higher.

(Searchdaimon has done some research on this and has a list, if anyone is interested: https://github.com/searchdaimon/adult-words )
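
To make the word-versus-phrase point concrete, here is a rough sketch in Python. The phrase set, the tokenizer, and the function names are all made up for illustration; none of it is taken from the Shutterstock or Searchdaimon lists, and a real filter would need far more than this.

```python
import re

# Hypothetical sample data, for illustration only (not from any linked repo).
FLAGGED_PHRASES = {("nude", "girls")}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def contains_flagged_phrase(text, phrases=FLAGGED_PHRASES):
    """Return True only if the words of a flagged phrase appear adjacent and in order."""
    tokens = tokenize(text)
    for phrase in phrases:
        n = len(phrase)
        # Slide a window of the phrase's length across the token list.
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == phrase:
                return True
    return False
```

With this, a text that merely contains both "girls" and "nude" somewhere is not flagged, while one containing the exact phrase "nude girls" is.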

There is also the data analysis perspective: http://qr.ae/8W4Pz1 :)
And they took down that repo.
Some of the Finnish ones:

jätkä - literally "dude"
hatullinen - hatful
lahtari - an outdated word; I doubt many young people know its meaning or context. It was an insulting name for the people on the White side during our civil war in 1918
pehko - thick hair

黒人 (black person) is on the Japanese list. I wasn't aware that that was a naughty word.
This wins repo of the day!
Having been on several projects that required such lists, I'm glad that such a repo exists, but be aware that it is just the tip of the iceberg. Stakeholders emerge from the woodwork: such and such on the delivery team had a terrible experience before so we should include these words, HR and marketing have company-specific lists to merge in, some of the producers have unsettlingly precise and revelatory requirements ...
TIL the word "anilingus." Thanks!