|
|
|
|
|
by dekhn
713 days ago
|
|
Thanks. The reason I asked the question is that I've struggled to understand RNNs and other networks (compared to MLPs, CNNs, and transformers) due to the subtlety of their design and my hope was that I could simply forget about them. I'm surprised about only testing for LSTMs- of all the sequence/memory models, they seem like the most arbitrary and hacky, but I've never been able to determine if that's simply because I don't understand those types of models (my training is in HMMs- do you teach/test those?) |
|
A lot of my research has focused on LSTMs, and so I am partial to them. I think they are super useful and have a lot of properties, but frankly speaking if you had to choose one architectures of the ones you mentioned, LSTMs/RNNs are probably the most OK to skip.
That said, if you just look at a simple RNN like the Jordan RNNs and focus on understanding that, then LSTMs just become fancy RNNs with some forgetting and remembering logic.