by ActorNightly
395 days ago
You are right about the ordering of operations, where recurrent networks carry a whole bunch of extra computational complexity. However, a Transformer, for example, can be represented with just densely connected layers, albeit with a lot of zeros for weights.
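A rough sketch of the "lot of zeros" point, under stated assumptions: it covers only a linear map shared across tokens (like the position-wise projections in a Transformer), not attention itself, whose mixing weights depend on the input. The shapes and variable names are made up for illustration. Applying the same weights to every token gives the same result as one big fully connected layer over the flattened sequence whose weight matrix is block-diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_in, d_out = 4, 3, 2             # illustrative sizes, not from the thread

x = rng.normal(size=(seq_len, d_in))       # one token per row
w = rng.normal(size=(d_in, d_out))         # weights shared by every token

# Transformer-style view: the same linear map applied to each token.
per_token = x @ w                          # shape (seq_len, d_out)

# Dense-layer view: flatten the sequence and use one large weight matrix
# that is block-diagonal, so tokens do not mix and most entries are zero.
w_dense = np.kron(np.eye(seq_len), w)      # shape (seq_len*d_in, seq_len*d_out)
dense = x.reshape(-1) @ w_dense            # shape (seq_len*d_out,)

same = np.allclose(per_token.reshape(-1), dense)
zero_fraction = float(np.mean(w_dense == 0.0))
print(same, zero_fraction)
```

With these shapes the dense matrix is three-quarters zeros (a fraction of 1 - 1/seq_len), and that fraction grows with sequence length.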