| HN Mirror



	by sailingparrot 717 days ago
	Not if you want to be a PhD/researcher in ML; yes otherwise. Source: I've been working on ML/LLMs as a research engineer for the past 7 years, including at one of the FAANG research labs. I always wanted to take the time to learn about RNNs, but never did and never needed to.

3 comments

rolisz 717 days ago

Oh, I'm sure plenty of recent PhDs don't know about RNNs. They've been dropped like a hot potato in the last 4-5 years.


sailingparrot 717 days ago

I think that to do pure research it’s definitely worth knowing about the big ideas of the past, why we moved on from them, what we learned, etc.


derangedHorse 717 days ago

I haven’t read it in a while, but I remember this post giving a good rundown of RNNs:

https://dennybritz.com/posts/wildml/recurrent-neural-network...


jszymborski 716 days ago

None of the students who have taken the classes I TA pass w/o learning about RNNs.


dekhn 716 days ago

Is that true also of LSTMs?


jszymborski 716 days ago

Yes. We cover Jordan and Elman RNNs, LSTMs, and GRUs. Assignments only really test for LSTM knowledge, though.


dekhn 715 days ago

Thanks. The reason I asked the question is that I've struggled to understand RNNs and other networks (compared to MLPs, CNNs, and transformers) due to the subtlety of their design and my hope was that I could simply forget about them.

I'm surprised about only testing for LSTMs. Of all the sequence/memory models, they seem like the most arbitrary and hacky, but I've never been able to determine if that's simply because I don't understand those types of models (my training is in HMMs; do you teach/test those?)


jszymborski 711 days ago

No, we don't teach HMMs (although that would be super cool). It's strictly a neural networks class.

A lot of my research has focused on LSTMs, and so I am partial to them. I think they are super useful and have a lot of properties, but frankly speaking, if you had to choose one of the architectures you mentioned to skip, LSTMs/RNNs are probably the most OK to skip.

That said, if you just look at a simple RNN like the Jordan RNN and focus on understanding that, then LSTMs just become fancy RNNs with some forgetting and remembering logic.
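
To make that concrete, here is a rough NumPy sketch of a single time step for each kind of cell. It is only an illustration under assumptions, not material from the class: the function names, the stacked weight layout in `lstm_step`, and the linear output layer in `jordan_step` are all hypothetical choices made for brevity.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def elman_step(x, h_prev, W_x, W_h, b):
    """Elman RNN: the previous hidden state feeds back into the hidden layer."""
    return np.tanh(W_x @ x + W_h @ h_prev + b)


def jordan_step(x, y_prev, W_x, W_y, b, W_out, b_out):
    """Jordan RNN: the previous *output* feeds back into the hidden layer.

    Assumption: a plain linear output layer, chosen to keep the sketch short.
    """
    h = np.tanh(W_x @ x + W_y @ y_prev + b)
    y = W_out @ h + b_out
    return h, y


def lstm_step(x, h_prev, c_prev, W, U, b):
    """LSTM: the same recurrence idea, plus gates around a cell state c.

    Assumption: W, U and b stack the parameters of the four gates in the
    order forget (f), input (i), output (o), candidate (g).
    """
    z = W @ x + U @ h_prev + b
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)
    g = np.tanh(g)
    c = f * c_prev + i * g   # "forgetting" (f) and "remembering" (i)
    h = o * np.tanh(c)       # the output gate decides what to expose
    return h, c
```

Running any of these over a sequence is just a loop that feeds the returned state back in as `h_prev` (or `y_prev`, or `h_prev` and `c_prev`) at the next step. The only structural difference in the LSTM is the line that updates `c`: with the forget gate near 1 and the input gate near 0, the cell state is carried forward unchanged.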
