by rhaps0dy
916 days ago
Its latent space transition is linear rather than nonlinear, so there's a more parallelizable algorithm for advancing it through time. This makes it much more efficient to train and run inference with on GPUs. It keeps all the representational power of LSTMs by letting the transition vary with the input (while still being linear in the state).
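To make the parallelization point concrete, here is a minimal sketch, assuming a scalar state and a recurrence of the form h_t = a_t * h_{t-1} + b_t. The names `a`, `b`, `sequential_scan`, `combine`, and `doubling_scan` are made up for illustration; in a real model the coefficients would be computed from the input and the state would be a vector, and this is not the model's actual kernel. Each step is an affine map of the state, and composing affine maps is associative, so the whole sequence can be evaluated with a prefix scan in about log2(T) rounds instead of T dependent steps.

```python
def sequential_scan(a, b, h0=0.0):
    """Reference loop: h_t = a_t * h_{t-1} + b_t, one step at a time."""
    h, out = h0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def combine(left, right):
    """Compose two affine maps h -> a*h + b, applying `left` first, then `right`."""
    a1, b1 = left
    a2, b2 = right
    return (a2 * a1, a2 * b1 + b2)


def doubling_scan(a, b, h0=0.0):
    """Inclusive prefix scan by recursive doubling (Hillis-Steele style).

    Runs about log2(T) rounds. Inside a round every element is updated
    independently of the others, which is the part a GPU could do in parallel.
    Here the rounds are simulated with a plain list comprehension.
    """
    elems = list(zip(a, b))
    step = 1
    while step < len(elems):
        elems = [
            combine(elems[i - step], elems[i]) if i >= step else elems[i]
            for i in range(len(elems))
        ]
        step *= 2
    # elems[t] is now the composed map from h0 to h_t.
    return [a_t * h0 + b_t for a_t, b_t in elems]
```

The trick depends on linearity: with a nonlinear transition such as an LSTM's, two consecutive steps cannot be collapsed into a single map of the same simple form, so the steps have to be run in order.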