by acadien
2061 days ago
Saddles are a way of conceptualizing high-dimensional optimization problems. On a 3-dimensional surface, you can picture a saddle as a curve that follows a minimum in at least one dimension. Another way to think about it: you're sitting at the minimum of a parabola in 2 dimensions, but then you look along a 3rd dimension and see you're not at a minimum there. Loosely speaking, any time you're at a minimum in at least 1 dimension but not in all of them, you're on a saddle. (Strictly, a saddle point also needs zero gradient, with the surface curving up in some directions and down in others.) You can extend this concept to a neural net, which lives in millions of dimensions, undergoing SGD. At the start of an optimization run, SGD moves in some direction to minimize the bundled cost, inevitably stumbling into minima in (usually) many dimensions. Subsequent iterations shift some dimensions out of their minima and other dimensions into minima, so the net is effectively always living on a saddle during this process. There are many papers that discuss training in these terms and others that use the idea implicitly. I wouldn't say it's a "hot area of research", but more of a tool for thinking about these processes and sometimes gaining some insight into why things get stuck during training.
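To make the picture concrete, here is a minimal sketch in plain Python. The toy surface f(x, y) = x^2 - y^2 and the helper names (`f`, `grad`, `descend`) are my own assumptions for illustration, not anything from a specific paper or library. The origin is a minimum along x and a maximum along y, so plain gradient descent started exactly on the x-axis slides into the saddle and stalls, while a tiny nudge in y lets it escape.

```python
# Toy saddle: f(x, y) = x^2 - y^2 (hypothetical example surface).
# Curvature is +2 along x (a minimum) and -2 along y (a maximum).

def f(x, y):
    return x**2 - y**2

def grad(x, y):
    # Analytic gradient of f.
    return (2 * x, -2 * y)

def descend(x, y, lr=0.1, steps=100):
    # Plain (non-stochastic) gradient descent with a fixed step size.
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - lr * gx, y - lr * gy
    return x, y

# Start exactly on the x-axis: the y-gradient is zero forever,
# so the iterate converges to the saddle at the origin and stays there.
on_ridge = descend(1.0, 0.0)

# Same start with a tiny perturbation in y: the negative-curvature
# direction amplifies it and the iterate leaves the saddle.
nudged = descend(1.0, 1e-6)

print("stuck at saddle:", on_ridge, "f =", f(*on_ridge))
print("escaped saddle: ", nudged, "f =", f(*nudged))
```

This toy surface is unbounded below, so the escaped run just keeps going downhill. The point is only that a small amount of noise (which SGD supplies for free) is enough to leave a direction where you are not actually at a minimum.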