by dwiel
4411 days ago
Interesting to note, though, that even for a linear network that can be represented by a single matrix, training it as multiple layers can be faster and easier and can converge to better results, because the optimization algorithm is presented with a different gradient and parameter space.
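To make the "different gradient" point concrete, here is a rough NumPy sketch. The toy regression problem, the sizes, the learning rate and all names are my own assumptions, not something from a particular paper or library. It takes one gradient step on a single matrix W, and one gradient step on the factors of W = W2 @ W1, then compares where the effective matrix ends up. It only shows that the two updates differ; it does not by itself show that the factored version trains faster or better.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: recover a linear map A from noiseless samples.
n, d_in, d_hidden, d_out = 64, 5, 5, 3
X = rng.normal(size=(n, d_in))
A = rng.normal(size=(d_out, d_in))
Y = X @ A.T

def grad_wrt_product(W):
    """Gradient of 0.5 * mean squared error with respect to the effective matrix W."""
    residual = X @ W.T - Y        # shape (n, d_out)
    return residual.T @ X / n     # shape (d_out, d_in)

# Two-layer linear network computing the same map as one matrix: W = W2 @ W1.
W1 = 0.5 * rng.normal(size=(d_hidden, d_in))
W2 = 0.5 * rng.normal(size=(d_out, d_hidden))
W = W2 @ W1

lr = 1e-3
G = grad_wrt_product(W)

# (a) Plain gradient step on the single matrix.
W_single = W - lr * G

# (b) Gradient step on each factor (chain rule), then recompose the product.
W1_new = W1 - lr * (W2.T @ G)
W2_new = W2 - lr * (G @ W1.T)
W_two_layer = W2_new @ W1_new

# To first order in lr, step (b) moves the product along a transformed
# (preconditioned) version of G rather than along G itself.
first_order = W - lr * (W2 @ W2.T @ G + G @ W1.T @ W1)
second_order_term = lr**2 * (G @ W1.T @ W2.T @ G)

print("max |single - two_layer|:", np.abs(W_single - W_two_layer).max())
print("max |two_layer - first_order|:", np.abs(W_two_layer - first_order).max())
```

Both parameterizations represent exactly the same function at the start, but the step taken in factor space depends on the current W1 and W2, so the optimizer sees a different landscape even though the set of representable maps is unchanged.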