by sailingparrot 717 days ago
Not if you want to be a PhD/Researcher in ML, yes otherwise.

Source: I've worked on ML/LLMs as a research engineer for the past 7 years, including at one of the FAANG research labs. I always wanted to take the time to learn about RNNs but never did and never needed to.

3 comments

Oh, I'm sure plenty of recent PhDs don't know about RNNs. They've been dropped like a hot potato in the last 4-5 years.
I think to do pure research it’s definitely worth knowing about the big ideas of the past, why we moved on from them, what we learned etc.
I haven’t read it in a while, but I remember this post giving a good rundown of RNNs:

https://dennybritz.com/posts/wildml/recurrent-neural-network...

None of the students who have taken the classes I TA pass w/o learning about RNNs.
Is that true also of LSTMs?
Yes. We cover Jordan and Elman RNNs, LSTMs, and GRUs. Assignments only really test for LSTM knowledge, though.
Thanks. The reason I asked the question is that I've struggled to understand RNNs and other networks (compared to MLPs, CNNs, and transformers) due to the subtlety of their design and my hope was that I could simply forget about them.

I'm surprised about only testing for LSTMs. Of all the sequence/memory models, they seem like the most arbitrary and hacky, but I've never been able to determine whether that's simply because I don't understand those types of models (my training is in HMMs; do you teach/test those?).

No, we don't teach HMMs (although that would be super cool). It's strictly a neural networks class.

A lot of my research has focused on LSTMs, so I am partial to them. I think they are super useful and have a lot of nice properties, but frankly speaking, if you had to choose one of the architectures you mentioned to skip, LSTMs/RNNs are probably the most OK to skip.

That said, if you just look at a simple RNN like the Jordan RNN and focus on understanding that, then LSTMs just become fancy RNNs with some forgetting and remembering logic.
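
To make the "fancy RNN with forgetting and remembering logic" point a bit more concrete, here is a minimal sketch in plain Python. It is not from any class or paper mentioned in this thread; the function names (`jordan_step`, `lstm_step`, `init_params`) and the parameter keys (`W_f`, `b_i`, etc.) are made up for illustration, and the `tanh` on the Jordan output is just one possible choice. It assumes the common textbook formulations: a Jordan step feeds the previous output back in, and an LSTM step wraps the same kind of update in forget, input, and output gates.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Return W @ v + b for a list-of-rows matrix W and list vectors v, b."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def init_params(n_in, n_hid, n_out, seed=0):
    """Hypothetical parameter layout shared by both steps below."""
    rng = random.Random(seed)
    p = {
        "W_h": rand_matrix(rng, n_hid, n_in + n_out), "b_h": [0.0] * n_hid,
        "W_y": rand_matrix(rng, n_out, n_hid), "b_y": [0.0] * n_out,
    }
    for gate in "fiog":  # forget, input, output, candidate
        p["W_" + gate] = rand_matrix(rng, n_hid, n_in + n_hid)
        p["b_" + gate] = [0.0] * n_hid
    return p

def jordan_step(x, y_prev, p):
    """Jordan RNN step: the previous *output* y_prev is fed back in."""
    h = [math.tanh(z) for z in affine(p["W_h"], x + y_prev, p["b_h"])]
    y = [math.tanh(z) for z in affine(p["W_y"], h, p["b_y"])]
    return y

def lstm_step(x, h_prev, c_prev, p):
    """LSTM step: a recurrent update plus gates that forget and remember."""
    v = x + h_prev  # concatenate input and previous hidden state
    f = [sigmoid(z) for z in affine(p["W_f"], v, p["b_f"])]    # forget gate
    i = [sigmoid(z) for z in affine(p["W_i"], v, p["b_i"])]    # input ("remember") gate
    o = [sigmoid(z) for z in affine(p["W_o"], v, p["b_o"])]    # output gate
    g = [math.tanh(z) for z in affine(p["W_g"], v, p["b_g"])]  # candidate, a plain-RNN-style update
    # Keep part of the old cell state (forget) and add part of the candidate (remember).
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c
```

Note that the candidate `g` is computed the same way as the hidden state in a simple RNN; the gates `f` and `i` only decide how much of the old cell state to keep and how much of `g` to write in. If the forget gate saturates near 1 and the input gate near 0, the cell state is carried through the step essentially unchanged.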