by ctandre
3703 days ago
Do you mean that it's possible to fit your parameters over all inputs without gradient descent? I'm somewhat confused, since I don't think that's possible in the general case (e.g. nonlinear problems usually have no closed-form solution, so you end up resorting to an iterative procedure like gradient descent). I can see that gradient descent might still make sense for problems that do have clean analytic solutions (if that's what you meant), as those solutions often become impractical at scale. Linear regression is a good example: it has a nice closed-form solution (the normal equations), provided the matrix involved is invertible. But the cost scales poorly, since the naive implementation has to form and invert a matrix whose size grows with the number of features (roughly cubic time for the inversion), so a different method might be used for a large problem, and gradient descent is one candidate. I think gradient descent is attractive because it keeps no state between batches beyond the current parameters: you can process the training data in batches instead of the entire dataset in one go, without explicitly tracking previous batches. This is a great feature when your dataset is too large to hold in memory at once. I think this is what you were suggesting?
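To make the comparison concrete, here's a rough sketch in plain Python for the one-feature case (y ≈ w*x + b). The function names, the synthetic data, the learning rate and the batch size are all my own assumptions, picked just for illustration. The closed-form fit needs sums over the whole dataset, while the mini-batch version only ever carries (w, b) from one batch to the next.

```python
import random

def closed_form_fit(xs, ys):
    """Least-squares slope and intercept for y ~ w*x + b (normal equations, 1 feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = cov_xy / var_x  # breaks down if every x is identical (no unique solution)
    b = mean_y - w * mean_x
    return w, b

def minibatch_gd_fit(batches, lr=0.1, epochs=200):
    """Mini-batch gradient descent; only (w, b) persist between batches."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for xs, ys in batches:
            m = len(xs)
            # Gradient of the mean squared error computed on this batch alone.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / m
            grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / m
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Hypothetical synthetic data: true slope 3.0, true intercept 0.5, small noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(200)]
ys = [3.0 * x + 0.5 + rng.gauss(0, 0.01) for x in xs]

# Split into batches of 20; in a real setting these could be streamed from disk.
batches = [(xs[i:i + 20], ys[i:i + 20]) for i in range(0, len(xs), 20)]

w_cf, b_cf = closed_form_fit(xs, ys)
w_gd, b_gd = minibatch_gd_fit(batches)
print("closed form:", w_cf, b_cf)
print("mini-batch GD:", w_gd, b_gd)
```

Both should land close to the true slope and intercept. With a single feature the closed form is obviously cheaper; the trade-off only flips once the number of features (and the dataset) gets large enough that forming and inverting the matrix becomes the bottleneck.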