| HN Mirror



	by joefkelley 3663 days ago
	Why not train the RNN to directly predict emojis, instead of projecting everything to semantic space and picking the closest emoji? Seems like that would help with the problem of emojis with multiple meanings in different contexts. With this model, they could only be in a single point in semantic space.


spaky 3663 days ago

The RNN does in fact directly predict emoji. It outputs a vector of length 1624 (the number of emoji) containing the score associated with each emoji given the input text. This vector of probabilities is what can be thought of as the point in semantic space.

The issue with multiple meanings is this: if you strongly predict an ambiguous emoji (say the prayer emoji), how do you then work out which concept the sentence contains (e.g. was the person saying "thanks" or "high five" or "please")?

[I'm also a Dango dev]

wxs 3663 days ago

So yeah: we can focus on vectors at different levels of the net and these are in some sense different semantic spaces. In the article I talk about a level immediately before it projects onto the emoji vectors. If you look at the output after the projection (and do a softmax) you get a probability distribution across all emoji. This would be a different space in which each axis is an emoji, rather than the emoji being points distributed around the space.
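
A rough sketch of the two spaces described above, in plain Python. It assumes a toy setup with 3 emoji and a 2-dimensional hidden state, and it assumes the projection is a dot product with each emoji vector. The names `emoji_vectors`, `project` and `hidden`, and all the numbers, are made up for illustration and are not taken from the Dango model.

```python
import math

# Hypothetical toy values: 3 emoji embedded in a 2-d pre-projection space.
# (The thread mentions 1624 emoji; 3 keeps the example readable.)
emoji_vectors = {
    "pray": [1.0, 0.0],
    "joy": [0.0, 1.0],
    "heart": [-1.0, 0.0],
}

def project(hidden, vectors):
    """Dot the hidden state with each emoji vector: one raw score per emoji."""
    return {
        name: sum(h * v for h, v in zip(hidden, vec))
        for name, vec in vectors.items()
    }

def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores.values())
    exps = {name: math.exp(s - m) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}

# A point in the pre-projection space, where emoji are themselves points.
hidden = [2.0, 0.5]

# After projection and softmax: a point in the space with one axis per emoji.
scores = project(hidden, emoji_vectors)
probs = softmax(scores)
best = max(probs, key=probs.get)
```

Here `hidden` and the entries of `emoji_vectors` live in the same 2-d space, which is why emoji can be drawn as points next to sentences there, while `probs` has one coordinate per emoji.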

joefkelley 3663 days ago

Awesome, thanks for clarifying. So does the training optimize some property of the "semantic" layer immediately before the final emoji prediction layer? Or does it just optimize accuracy of emoji prediction directly?

And then the t-SNE projection shown in the article is based on this same layer (one before prediction)?

wxs 3663 days ago

Well those are sort of equivalent. But yeah, we use cross-entropy between the projected output and the target emoji distribution as our objective to minimize.
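
For readers who want the objective spelled out, here is a minimal sketch of cross-entropy between a softmaxed output and a target emoji distribution. It is written in plain Python under assumed toy values: the three-element `logits` and the one-hot `target` are hypothetical, and the real training code is not shown in this thread.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, predicted, eps=1e-12):
    """H(target, predicted) = -sum_i target_i * log(predicted_i).

    eps guards against log(0); terms with zero target weight are skipped.
    """
    return -sum(
        t * math.log(p + eps)
        for t, p in zip(target, predicted)
        if t > 0
    )

# Hypothetical projected output for 3 emoji, and a target that puts all
# of its mass on the first emoji.
logits = [2.0, 0.5, -2.0]
target = [1.0, 0.0, 0.0]

predicted = softmax(logits)
loss = cross_entropy(target, predicted)

# Baseline: a model that scores every emoji equally pays log(3) here.
uniform_loss = cross_entropy(target, softmax([0.0, 0.0, 0.0]))
```

Minimizing `loss` pushes probability toward the target emoji, which is one way to see why optimizing the pre-projection layer and optimizing prediction accuracy are "sort of equivalent": the gradient of this loss flows back through the projection into that layer.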

And yes, we do the t-SNE on that pre-projection space. That's why we can visualize the targets (emoji) in it. We can also t-SNE the word embeddings themselves — the input to the RNN — which is also kind of interesting. It automatically learns all kinds of structures there. Chris Olah has a good post on word embeddings if you're interested: http://colah.github.io/posts/2014-07-NLP-RNNs-Representation...
