by baristaGeek
3331 days ago
That's a great explanation. Let me add this, though: feedforward artificial neural networks were first proposed to compute the probability of a sequence of words occurring. RNNs were the next step in Natural Language Processing because, unlike that earlier architecture, they accept variable-length sequences as input. However, a simple RNN architecture didn't capture long-term dependencies well (that is, predicting words in one part of a text that depend on an idea developed much earlier in the corpus). So two fancier kinds of RNN architecture were developed to tackle this problem: GRUs and LSTMs. Production systems already implement these architectures, and they yield pretty accurate results. But now Facebook researchers are proposing CNNs for this task, because that architecture can take better advantage of GPU parallelism.
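
To make the variable-length point concrete, here is a minimal sketch of a simple (Elman-style) RNN in plain Python. Everything in it is an assumption for illustration: the function name `rnn_final_state`, the weight names `w_xh` and `w_hh`, the tiny sizes, and the random untrained weights. It is not how any production system is built.

```python
import math
import random


def rnn_final_state(inputs, w_xh, w_hh, hidden_size):
    """Run a simple RNN over a sequence of any length and return the last hidden state."""
    h = [0.0] * hidden_size  # initial hidden state
    for x in inputs:  # one step per token; each step needs the previous h
        h = [
            math.tanh(
                sum(w_xh[i][j] * x[j] for j in range(len(x)))
                + sum(w_hh[i][k] * h[k] for k in range(hidden_size))
            )
            for i in range(hidden_size)
        ]
    return h


random.seed(0)
INPUT_SIZE, HIDDEN_SIZE = 3, 4  # hypothetical sizes, chosen only for the demo

# Random, untrained weights: input-to-hidden and hidden-to-hidden.
w_xh = [[random.uniform(-0.5, 0.5) for _ in range(INPUT_SIZE)] for _ in range(HIDDEN_SIZE)]
w_hh = [[random.uniform(-0.5, 0.5) for _ in range(HIDDEN_SIZE)] for _ in range(HIDDEN_SIZE)]


def random_sequence(length):
    """Stand-in for a sequence of word vectors."""
    return [[random.uniform(-1, 1) for _ in range(INPUT_SIZE)] for _ in range(length)]


# The same weights handle a 2-token and a 7-token sequence.
short_state = rnn_final_state(random_sequence(2), w_xh, w_hh, HIDDEN_SIZE)
long_state = rnn_final_state(random_sequence(7), w_xh, w_hh, HIDDEN_SIZE)
print(len(short_state), len(long_state))  # prints: 4 4
```

Note that the loop is inherently sequential: step t can't start until step t-1 has produced its hidden state. A convolution over the same sequence has no such dependency between positions, which is why it maps more naturally onto GPU parallelism.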