|
|
|
|
|
by jay-anderson
3571 days ago
|
|
Any suggestions on where to start learning how to implement this? I understand some of the high level concepts (and took an intro AI class years ago - probably not terribly useful), but some of them are very much over my head (e.g. 2.2 Softmax Distributions and 2.3 Gated Activation Units) and some parts of the paper feel somewhat hand-wavy (2.6 Context Stacks). Any pointers would be useful as I attempt to understand it. (EDIT: section numbers refer to their paper) |
|