by zschuessler
3581 days ago
Fun fact: this won't tell you which bad word was said. Google's implementation of the speech API returns asterisks for words it deems bad. See: https://github.com/knpwrs/grumbles.js/blob/master/src/grumbl... It's also interesting to see which words Google deems bad and which it mysteriously doesn't. The API processes sentence structure in real time and will correctly return "<three asterisks> on me" and "cum to the park", based on intent. (Sorry for the offensive speech!) For a side project I needed to find every single English word/phrase the API would filter, and I stumbled upon that in amazement. (Side note: speaking a long list of bad words into a microphone very slowly was the most fun QA I've done.)
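For anyone who wants to check a transcript for filtered words, here is a minimal sketch. It assumes a filtered word shows up as a run of two or more asterisks, optionally after some leading letters; that format is my assumption, not documented behavior. `findMaskedWords` is a hypothetical helper, not part of the speech API or grumbles.js.

```typescript
// Hypothetical helper; not part of the Web Speech API or grumbles.js.
// Assumption: a filtered word appears in the transcript as a run of two or
// more asterisks, optionally preceded by word characters (e.g. "***", "f***").
function findMaskedWords(transcript: string): string[] {
  return transcript
    .split(/\s+/)
    // Drop trailing punctuation so "***!" is treated the same as "***".
    .map((token) => token.replace(/[.,!?]+$/, ""))
    // Keep only tokens that end in a run of at least two asterisks.
    .filter((token) => /^\w*\*{2,}$/.test(token));
}

// Example: the masked token is found, but the original word is not recoverable.
const masked = findMaskedWords("*** on me");
console.log(masked.length > 0 ? "filtered word detected" : "clean");
```

Note that this only tells you a word was filtered, not which one, which is the limitation described above.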
|
Depending on what you use it for, this could render the service useless. Imagine using it to, I don't know, identify Pulp Fiction lines against a corpus of scripts. It would fail spectacularly.
Another context where this could fail very quickly is with speakers of other languages. For example, if I'm not wrong, saying "Jesus!" might be impolite in (some contexts in) the US. In Spain, we say "Jesus!" when someone sneezes, instead of "Bless you!" (and, in general, we are outrageously foul-mouthed compared to the US).