by magicalhippo
761 days ago
As mentioned, this is difficult. AFAIK the main reason is that the power of neural nets comes from the non-linear functions applied at each node ("neuron"), and thus there's nothing like the superposition principle[1] to easily combine training results. The lack of superposition means you can't efficiently train one layer separately from the others either.

That being said, a popular non-linear function in modern neural nets is ReLU[2], which is piecewise linear, so perhaps there's some cleverness one can do there.

[1]: https://en.wikipedia.org/wiki/Superposition_principle

[2]: https://en.wikipedia.org/wiki/Rectifier_(neural_networks)
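To make the superposition point concrete, here's a toy sketch in plain Python. The function names and the weights (2.0 and -3.0) are made up for illustration and don't come from any real network. It checks whether f(a + b) == f(a) + f(b) for a two-layer scalar "net" with and without a ReLU in the middle.

```python
def relu(x):
    # Rectifier: passes positive inputs through, clamps the rest to zero.
    return max(0.0, x)

def linear_net(x, w1=2.0, w2=-3.0):
    # Two "layers" with no activation: just a composition of scalings.
    return w2 * (w1 * x)

def relu_net(x, w1=2.0, w2=-3.0):
    # Same weights (arbitrary, illustrative), with a ReLU between the layers.
    return w2 * relu(w1 * x)

a, b = 1.0, -2.0

# Superposition holds for the purely linear net.
linear_holds = linear_net(a + b) == linear_net(a) + linear_net(b)

# It fails for the ReLU net when the inputs fall on different sides of zero.
relu_holds = relu_net(a + b) == relu_net(a) + relu_net(b)

# Within one linear piece (all inputs positive here) it holds again.
same_side_holds = relu_net(1.0 + 2.0) == relu_net(1.0) + relu_net(2.0)

print(linear_holds)     # True
print(relu_holds)       # False
print(same_side_holds)  # True
```

The last check is the piecewise-linear bit: as long as the inputs stay in a region where the ReLU doesn't switch on or off, the net behaves like a linear map there. Whether that can be exploited for training is a separate question this toy doesn't answer.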