by nshm
3703 days ago
Strictly speaking, if you split the parameter set into batches and iterate over the batches, optimizing each subset of parameters with a gradient step, it is not pure gradient descent. It is closer to a combination of coordinate descent (because you first select the subset of coordinates to optimize) and gradient descent.
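To make the distinction concrete, here is a rough sketch of that hybrid scheme, which is commonly called block coordinate descent. Everything in it is an assumption for illustration: the toy objective, the names objective_grad and block_gradient_descent, the block layout, and the step size are all made up and not taken from any particular library.

```python
def objective_grad(x, targets):
    """Gradient of the toy objective
    f(x) = sum_i (x_i - targets_i)^2 + 0.5 * (sum_i x_i)^2.
    The second term couples the coordinates, so blocks are not independent."""
    total = sum(x)
    return [2.0 * (xi - ti) + total for xi, ti in zip(x, targets)]


def block_gradient_descent(targets, blocks, lr=0.1, sweeps=500):
    """Cycle over blocks of coordinates, taking a gradient step on one block
    at a time while all other coordinates stay fixed."""
    x = [0.0] * len(targets)
    for _ in range(sweeps):
        for block in blocks:
            # Re-evaluate the gradient at the current point, so each block
            # sees the updates already made to the previous blocks.
            g = objective_grad(x, targets)
            for i in block:
                x[i] -= lr * g[i]  # only this block's coordinates move
    return x


targets = [1.0, 2.0, 3.0, 4.0]
blocks = [[0, 1], [2, 3]]  # hypothetical split of the parameters into batches
x = block_gradient_descent(targets, blocks)
print(x)
```

Plain gradient descent would update all four coordinates from a single gradient evaluation. Here the second block is updated using a gradient computed after the first block has already moved, which is the coordinate descent part of the mix.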