by bckr
1913 days ago
Agreed. It's really more like playing with Lego. Take residual connections, for instance. The insight was that information wasn't traveling far enough once the networks got too deep. So they just... plugged the earlier layers into the later layers. And this has been a very important development. Or take things like batch norm. We don't know why it's important. People do math to try to explain what's going on, not so much to figure out where we should go next. Related: "Understanding is a poor substitute for convexity"
https://www.edge.org/conversation/nassim_nicholas_taleb-unde...
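
To make the "plugged the earlier layers into the later layers" bit concrete, here is a toy sketch in plain Python. Everything in it is made up for illustration: `layer`, `plain_stack`, `residual_stack`, the scalar weight of 0.1 and the depth of 50 are all assumptions, not anything from a real architecture. It only shows the forward signal, not gradients, but the same "add the input back" trick is what a residual connection does.

```python
import math

def layer(x, w):
    # Hypothetical toy "layer": scale each value by a scalar weight, then tanh.
    return [math.tanh(w * xi) for xi in x]

def plain_stack(x, weights):
    # Ordinary deep stack: each layer only sees the previous layer's output.
    for w in weights:
        x = layer(x, w)
    return x

def residual_stack(x, weights):
    # Residual stack: each layer's output is added back onto its input,
    # so the earlier signal is carried forward: x <- x + F(x).
    for w in weights:
        fx = layer(x, w)
        x = [xi + fi for xi, fi in zip(x, fx)]
    return x

x0 = [1.0, -0.5, 0.25]
weights = [0.1] * 50  # 50 layers with small weights (an arbitrary choice)

plain = plain_stack(x0, weights)
resid = residual_stack(x0, weights)

print("plain   :", plain)   # the input signal has shrunk to almost nothing
print("residual:", resid)   # the input signal is still there (and has grown)
```

With small weights the plain stack squashes the input toward zero layer after layer, while the residual stack keeps it around because every layer starts from what it was handed. If you set every weight to zero, the residual stack just returns its input unchanged, which is the "identity by default" property people usually point to.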
|
It is pretty well understood.