Hacker News new | ask | show | jobs
by whimsicalism 808 days ago
Yes, I think training this model would be hard. Perhaps something akin to how MoEs are trained where you impose some sort of loss distribution to encourage equitable routing, but for recursion.
1 comments

Look at the human brain for useful analogies?

The default mode network does recursive/looping processing in the absence of external stimuli and world interaction. Multiple separate modules outside of the network are responsible for stopping and regulating this activity.