by syllogism
3505 days ago
(Author here) Thanks! I'm planning two follow-up posts, one on each of the systems, that go through those details. I glossed over them in this post because I wanted to get across the more abstract story about the data types and transformations. There are lots of good posts about attention mechanisms. The WildML post is good, as is Chris Olah's post. Bidirectional RNNs are a little less well covered, but the idea is not too difficult to understand once you understand a single RNN (or LSTM, GRU, etc.). You should also read the papers :). That's how most people doing ML are staying up to date, including the people building practical things rather than doing research. Academia is so competitive, and writing is so cheap relative to experimentation, that the deep learning literature is really pretty easy to follow.
What do you think about dilated convolutional encoder/decoder networks [1]? Useful for NLP beyond machine translation?
[1] https://arxiv.org/abs/1610.10099, https://github.com/paarthneekhara/byteNet-tensorflow