| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by londons_explore 808 days ago
	I think the reason this hasn't been done is you have no way to decide how many recursions are necessary at train time. And if you pick a random number/try many different levels of recursion, you 'blur' the output. Ie. the output of a layer doesn't know if it should be outputting info important for the final result, or the output that is the best possible input to another round of recursion.

4 comments

whimsicalism 808 days ago

Yes, I think training this model would be hard. Perhaps something akin to how MoEs are trained where you impose some sort of loss distribution to encourage equitable routing, but for recursion.

link

hackerlight 807 days ago

Look at the human brain for useful analogies?

The default mode network does recursive/looping processing in the absence of external stimuli and world interaction. Multiple separate modules outside of the network are responsible for stopping and regulating this activity.

link

pizza 808 days ago

You could just learn the right estimated number of recursions, also passing 'backtracking'/'state' information at the next nested level. Kind of like how state space models encode extractible information via a basis function representation, you could encode extractible recursion state information into the embedding. See also transformers that can learn to recognize n-deep balanced parentheses (Dyck-n languages)

link

Nesco 807 days ago

I have been thinking about this topic for some time. It might be done using the energy of the token. If it's still higher than an energy limit, then process it again, and increase the energy limit. The energy could be computed using log-sum-exp: https://openreview.net/pdf?id=Hkxzx0NtDB

link

sdenton4 808 days ago

This is actually how EfficientNet trains, using random truncation of the network during training. It does just fine... The game is that each layer needs to get as close as it can to good output, improving in the previous activation quality.

link