by nl 2946 days ago
Non-English languages have different challenges. Typically there is much less training data, but the language itself has fewer exceptions to its rules.

> How would the system handle something like "Your service is the sh* t!"

This is pretty easy to handle correctly with sufficient training data. A good demonstration is the DeepMoji sentiment predictor: https://deepmoji.mit.edu/

Try:

Your service is sh* t!

Your service is shit!

Your service is the sh* t!

Your service is the shit!

Works pretty much perfectly.
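
To spell out what those four inputs are probing: the masked and unmasked spellings should come out the same, and the word "the" should flip the polarity. Here is a toy sketch in Python. It is purely rule-based and is not how DeepMoji works (that is a learned model); the function names and the regex are my own assumptions, made up for illustration.

```python
import re


def normalize(text: str) -> str:
    """Hypothetical helper: undo the 'sh* t' / 'sh*t' masking so both spellings match."""
    return re.sub(r"\bsh\*\s?t\b", "shit", text.lower())


def toy_polarity(text: str) -> str:
    """Hand-written rule: 'the shit' is slang praise, a bare 'shit' is an insult."""
    t = normalize(text)
    if re.search(r"\bthe shit\b", t):
        return "positive"
    if re.search(r"\bshit\b", t):
        return "negative"
    return "neutral"


phrases = [
    "Your service is sh* t!",
    "Your service is shit!",
    "Your service is the sh* t!",
    "Your service is the shit!",
]

for phrase in phrases:
    print(phrase, "->", toy_polarity(phrase))
```

A rule like this only covers the one idiom it was written for, which is the point of the comparison: a model trained on enough data picks up the same distinction without anyone enumerating the cases.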

Edit: how am I supposed to escape the * without leaving a space after it!?