HN Mirror


	by optimalsolver 808 days ago
	We can name these hypothetical objects Recursive Neural Networks.

3 comments

whimsicalism 808 days ago

I know you're jesting, but RNNs are recursive along the sequence length, whereas I am describing recursion along the depth.
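(A toy sketch of one way to read that distinction, not anyone's actual architecture. The scalar `cell` and both function names are made up for illustration. The first loop reuses one cell across time steps of a sequence; the second reuses the same cell across layers for a single input.)

```python
import math


def cell(h, x, w=0.5, u=1.0):
    # Hypothetical shared scalar "layer": mixes the carried state h with input x.
    return math.tanh(w * h + u * x)


def recur_along_sequence(xs):
    # Classic RNN pattern: the same cell is applied once per time step,
    # and the state is carried from one sequence position to the next.
    h = 0.0
    for x in xs:
        h = cell(h, x)
    return h


def recur_along_depth(x, depth):
    # Depth-wise weight sharing: the same cell is applied `depth` times
    # to a single input, and the state is carried from layer to layer.
    h = 0.0
    for _ in range(depth):
        h = cell(h, x)
    return h
```

In this toy the two loops coincide when the sequence just repeats one input, e.g. `recur_along_sequence([0.3, 0.3, 0.3])` and `recur_along_depth(0.3, 3)` run the same computation. The difference is only in which axis the loop walks.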

YeGoblynQueenne 807 days ago

Recursive NNs are not the same as Recurrent NNs:

https://en.wikipedia.org/wiki/Recursive_neural_network

Well ish. The article above explains that Recursive-NNs are hierarchical whereas RNNs are linear. I guess the distinction is a little on the fine side.
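To make the hierarchical vs. linear point concrete, here is a minimal sketch, assuming a toy scalar composition function (`compose` and its weights are invented for illustration, not taken from the article). A recursive net folds a tree bottom-up; a recurrent net folds a chain left to right, which amounts to the special case of a fully left-branching tree.

```python
import math


def compose(left, right, w_l=0.6, w_r=0.4):
    # Hypothetical shared composition function applied at every node.
    return math.tanh(w_l * left + w_r * right)


def recursive_encode(tree):
    # A tree is either a float leaf or a (left, right) tuple.
    # The same compose function is applied at each internal node, bottom-up.
    if isinstance(tree, tuple):
        left, right = tree
        return compose(recursive_encode(left), recursive_encode(right))
    return float(tree)


def recurrent_encode(seq, h0=0.0):
    # A chain: the running state is composed with each element in order.
    h = h0
    for x in seq:
        h = compose(h, x)
    return h
```

Under these assumptions, `recurrent_encode([0.1, 0.2, 0.3])` gives the same value as `recursive_encode((((0.0, 0.1), 0.2), 0.3))`, while a balanced tree over the same leaves generally gives a different one.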

Anyway carry on. Pedantic moment over.

whimsicalism 807 days ago

The recursive neural networks described there are a failed academic project from more than a decade ago, predating modern deep learning. Basically everyone using the phrase recursive NN nowadays is probably just misspeaking for RNN. RNNs also are not linear.

YeGoblynQueenne 805 days ago

I don't know about "everybody nowadays" but I remember Recursive Neural Nets as an architecture introduced by Christopher Manning with the argument that it was better suited to the hierarchical structure of language than existing architectures. I did find it a bit of a bad choice of name, given that it's so close to Recurrent Neural Nets. All this is from memory, though; I might check the internets later to see what I misremember.

RNNs are a large class of architectures of varying complexity, from Kalman filters to LSTMs. It's not clear to me exactly what the Wikipedia article means by "linear", but LSTMs, for example, treat their inputs as sequences and don't try to deconstruct them into parts, like e.g. Convolutional Neural Nets do. So maybe that's what's meant by "linear".

ska 807 days ago

No opinion on the specifics of this distinction, but it's worth noting that in research, an awful lot of successful projects have their origins in failed projects of decades ago...

whimsicalism 807 days ago

My experience working in machine learning academia is that there was an overfocus on failed projects from the 90s and early 00s, which only really stopped after 2020.

We can often trace back successful projects to failed precursors, but often the people behind the successful project are not even familiar with the failed precursor and the 'connection to the past' only really occurs in retrospect. See the 'adjoint state method' and connections with backprop.

ska 807 days ago

This is sometimes true, sure. And often the older work has entered the general consciousness rather than being chased down by searching for specific cites. On the other hand, very little is truly new, and recency bias can lead you into all sorts of back-eddies.

Once the dust has settled, there are often much clearer through lines than it looked like at the time. It's hard to see when you are on the moving front though.

benreesman 808 days ago

Depthwise RNN?

refulgentis 808 days ago

Like decode the next token, then adjust what you're paying attention to, then decode it again?

nine_k 808 days ago

Isn't it the only way to, say, understand a pun?

refulgentis 808 days ago

That is exactly how LLM inference is performed, so I'm being cheeky (I'm 99% sure anyone proposing anything in this thread is someone handwaving based on limited understanding)

whimsicalism 808 days ago

You would be wrong, but that is fine. Been working with attention since 2018.

Why assume I know little and leave snarky comments (and basically a repetition of the prior joke at that, subbing RNN for transformer)?

refulgentis 808 days ago

To playfully invite you to participate in the conversation further, so that I may humbly learn from you. "I don't know what you're talking about" seemed too spartan and austere and aggressive, and you reciprocated politely, if again sparsely, when the other person playfully invited you to elaborate.

conradev 808 days ago

Yep: https://arxiv.org/abs/2305.13048

p1esk 808 days ago

We did: https://en.m.wikipedia.org/wiki/Recursive_neural_network
