by SleekEagle
1515 days ago
Ultimately it comes down to gradient descent (which is pretty magical in its own right), but what's most surprising to me is that the loss landscape is actually organized enough to yield impressive results. Obviously the difficulties of training large NNs are well-documented, but I'm surprised it's even as easy as it is.