by miven
914 days ago
>a major feature of transformers being wildly faster inference than with LSTM

Wasn't the main issue with RNNs that the forward pass during training can't be efficiently parallelized across time steps? Inference itself should normally be faster for an RNN than for a transformer, since the former runs in time linear in the input length while the latter is quadratic.
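
To make the linear vs. quadratic point concrete, here is a rough sketch that just counts abstract "steps" for processing a sequence of n tokens. The function names `rnn_steps` and `attention_steps` are made up for illustration, and the count assumes causal self-attention where token t looks at all t tokens so far. It ignores constant factors, hidden size, and hardware parallelism, so treat it as a back-of-the-envelope comparison rather than a benchmark.

```python
def rnn_steps(n: int) -> int:
    """One fixed-size state update per token, so the total grows linearly."""
    return n


def attention_steps(n: int) -> int:
    """Token t attends to tokens 1..t, so the total is 1 + 2 + ... + n."""
    return n * (n + 1) // 2


if __name__ == "__main__":
    for n in (4, 1_000, 100_000):
        print(n, rnn_steps(n), attention_steps(n))
```

For n = 1,000 this gives 1,000 steps for the RNN and 500,500 for attention. The catch is that the RNN's steps have to run one after another, while the attention work for a known sequence can be done in parallel, which is the training-time advantage mentioned above.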