by baalimago
2487 days ago
>Next we shall take a moment to remember the fallen heroes, without whom we would not be where we are today. I am, of course, referring to the RNNs - Recurrent Neural Networks, a concept that became almost synonymous with NLP in the deep learning field.

XLNet (https://arxiv.org/abs/1906.08237) is, in essence, a recurrent neural network: it uses a transformer (itself a neural network) that recurrently keeps context between different batches. It is true, though, that gated RNNs such as AWD-LSTM and GRU are losing ground to the superior transformer architectures. That's my only complaint; otherwise, an excellent theoretical introduction.

That said, if anyone wants to actually implement a transformer, beware that you will want a GPU with 8+ GB of memory, or be prepared to use cloud computing (Google Colab is free, for now). Training neural networks is still quite hardware-dependent.