by zschuessler 3581 days ago

Fun fact: this won't tell you which bad word. Google's implementation of the speech API returns asterisks for words it deems bad. See:

https://github.com/knpwrs/grumbles.js/blob/master/src/grumbl...
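As a rough illustration of the point above, a transcript can be checked for censored words just by looking for asterisks. This is only a sketch, not the code from the linked file: `findCensoredTokens` and `containsCensoredWord` are made-up names, and it assumes a censored word arrives as a token containing one or more asterisks (whether any letters are kept, and how many asterisks appear, may vary).

```typescript
// Hypothetical helpers; not part of grumbles.js or of any speech API.
// Assumption: a censored word shows up in the transcript as a token that
// contains at least one asterisk, e.g. "***" or "f***".
const CENSORED_TOKEN = /\S*\*+\S*/g;

// Return every whitespace-delimited token that contains an asterisk.
function findCensoredTokens(transcript: string): string[] {
  return transcript.match(CENSORED_TOKEN) ?? [];
}

// True if the recognizer appears to have masked at least one word.
function containsCensoredWord(transcript: string): boolean {
  return findCensoredTokens(transcript).length > 0;
}

console.log(findCensoredTokens("*** on me"));
console.log(containsCensoredWord("come to the park"));
```

Note that this only tells you that something was masked, not which word was spoken.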

It's also interesting to see which words Google deems bad, and which it mysteriously doesn't. The API processes sentence structure in real time and will return "<three asterisks> on me" and "cum to the park" correctly, based on intent. (Sorry for the offensive speech!)

For a side project I needed to find every single English word/phrase the API would filter. Stumbled upon that in amazement.

(Side note: speaking a long list of bad words into a microphone very slowly was the most fun QA I've done)

5 comments

harperlee 3581 days ago

I get that this is a free service and all, but I find that ridiculous. They are basically crippling a service that is global and not aimed at any particular application, based on a very localized interpretation of what is nice to say and what is not. This should be controlled at the application level, with the API providing at most hints about the tone.

Depending on what you use it for, this could render the service useless. Imagine using it to, I don't know, identify Pulp Fiction sentences against a corpus of scripts. It would fail spectacularly.

Another context where this could fail very quickly is with speakers of other languages. For example, if I'm not wrong, saying "Jesus!" might be impolite in (some contexts of) the US. In Spain, we say "Jesus!" when you sneeze, instead of "Bless you!" (and, in general, we are outrageously foul-mouthed compared to the US).
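To make the "control it at the application level" idea concrete, here is a minimal sketch in which the recognizer would hand back the raw transcript and each application applies its own list. Everything here is hypothetical (`FilterPolicy`, `applyPolicy`, the word lists); no real speech API is assumed to work this way.

```typescript
// Hypothetical application-side policy; all names are illustrative only.
type FilterPolicy = {
  blocked: ReadonlySet<string>; // lowercase words this application masks
  mask: string;                 // replacement text for a blocked word
};

// Mask blocked words in a raw transcript according to the caller's policy.
function applyPolicy(transcript: string, policy: FilterPolicy): string {
  return transcript
    .split(/(\s+)/) // capture group keeps whitespace runs, so spacing survives
    .map((token) => {
      // Compare case-insensitively and ignore surrounding punctuation.
      const key = token.toLowerCase().replace(/^[^\p{L}]+|[^\p{L}]+$/gu, "");
      // A blocked token is replaced whole, attached punctuation included.
      return policy.blocked.has(key) ? policy.mask : token;
    })
    .join("");
}

// A strict app masks the word; a script-matching app leaves it alone.
const strictApp: FilterPolicy = { blocked: new Set(["damn"]), mask: "****" };
const scriptApp: FilterPolicy = { blocked: new Set<string>(), mask: "****" };

console.log(applyPolicy("Well, damn it", strictApp)); // "Well, **** it"
console.log(applyPolicy("Well, damn it", scriptApp)); // "Well, damn it"
```

With an empty list the transcript passes through untouched, which is what the Pulp Fiction use case would need.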

harperlee 3581 days ago

By the way, I can't edit my comment any longer, but when I said:

> In Spain, we say "Jesus!" when you sneeze, instead of "Bless you!" (and, in general, we are outrageously foul-mouthed compared to the US).

...it may sound as if "¡Jesús!" ("Bless you!") is foul-mouthed, when in fact it is something a four-year-old would typically say.

c22 3581 days ago

I always found this hilarious: my phone won't let me swear in a text I am sending using voice-to-text, but it will gladly boom "fuck" over my car's Bluetooth when someone sends me a text with swear words.

Houshalter 3581 days ago

Since the voice recognition isn't perfect, there is a chance it could produce false positives and write a bad word you didn't say. Then people complain to Google or sue them. Similarly, Google search won't auto-suggest offensive words or certain libelous statements (e.g. "X is a criminal", even if they are, and even if it's a common search phrase).

slededit 3580 days ago

Sue them for what? Hurt feelings?

kpthunder 3581 days ago

Yeah, like I said in response to another comment:

> My original idea was going to involve a dependency on one of the bad words lists available on npm, but then I saw the API censors said words and thought, "Oh, that's easy."

bfuller 3581 days ago

I really wish I could turn this "feature" off. Not only is it annoying, but it also clearly shows people when I am using voice-to-text and when I am typing.

atomical 3580 days ago

You can turn it off on Android.

justinlardinois 3581 days ago

Any reason it's "cum" and not "come to the park"?

jdbernard 3581 days ago

I think he's trying to show that the speech API is smart enough to understand the context of the phrase and is not just blindly replacing words based on spelling.

Digit-Al 3581 days ago

I wonder how it would handle "I've come in my shorts today" vs "I've cum in my shorts today".
