by mycall 2024 days ago
> why are networks with skip connections SO much better than networks without?

What are the leading theories for why this seems to be the case? Fewer nodes to capture and direct decisions?

1 comment

Oh, there are plenty of good explanations in the neural network literature (my ELI5: skip connections make the default mapping an identity instead of a zero mapping, so you can start by doing no harm and improve from there). The method was suggested by knowledge from differential equations. All I'm saying is that the "everything is secretly an SVM" viewpoint is probably too coarse to explain these interesting and effective structural differences.
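
To make the "identity instead of zero" point concrete, here is a toy sketch in plain Python. Everything in it is an assumption for illustration: the two-layer ReLU branch, the function names, and the use of all-zero weights as a stand-in for "this block has learned nothing yet". It is not taken from any particular paper or library.

```python
# Toy illustration only: the layer shape, the ReLU branch, and all names
# below are hypothetical choices, not a reference implementation.

def linear(weights, x):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def relu(v):
    """Elementwise max(0, a)."""
    return [max(0.0, a) for a in v]

def branch(w1, w2, x):
    """The learned part of a block: f(x) = W2 * relu(W1 * x)."""
    return linear(w2, relu(linear(w1, x)))

def plain_block(w1, w2, x):
    """Block without a skip connection: y = f(x)."""
    return branch(w1, w2, x)

def residual_block(w1, w2, x):
    """Block with a skip connection: y = x + f(x)."""
    return [xi + fi for xi, fi in zip(x, branch(w1, w2, x))]

# With all-zero weights the branch contributes nothing.
zeros = [[0.0] * 3 for _ in range(3)]
x = [1.0, -2.0, 3.0]

plain_out = plain_block(zeros, zeros, x)        # the input is wiped out
residual_out = residual_block(zeros, zeros, x)  # the input passes through unchanged
```

With the weights at zero, the plain block maps every input to the zero vector, while the residual block returns its input untouched, so stacking many untrained residual blocks still leaves the signal intact. As for the differential equations remark, one way to read it is that the update y = x + f(x) has the same form as a forward Euler step x(t + h) = x(t) + h * f(x(t)) with h = 1.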