by tadkar
2038 days ago
And would the 2-memory algorithm be equivalent to gradient descent with momentum? I used to know what a subgradient was, but I think there must be something more to the ideas in the paper, because I’m struggling to see the analogy between a gradient descent where you take steps probabilistically and the algorithm described. Perhaps I need to think about how you could recast the quantile estimation problem as an optimisation problem and then apply what is effectively the machinery developed to train neural nets. Very interesting connection!
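To make the recasting concrete for myself, here is a rough sketch. It assumes the optimisation objective is the pinball (quantile) loss, whose minimiser is the q-th quantile; that choice is my assumption and not something taken from the paper. The names `pinball_subgradient` and `sgd_quantile`, the step size `lr`, and the Gaussian test stream are all made up for illustration. Each sample nudges the estimate up by `lr * q` if it lies above the estimate and down by `lr * (1 - q)` if it lies below, which is a plain stochastic subgradient step using a single number of state.

```python
import random


def pinball_subgradient(m, x, q):
    """One subgradient of the pinball loss with respect to the estimate m.

    The loss is q * (x - m) when x > m and (1 - q) * (m - x) otherwise.
    """
    if x > m:
        return -q
    if x < m:
        return 1.0 - q
    return 0.0  # at the kink, 0 is a valid subgradient


def sgd_quantile(stream, q, lr=0.001, m0=0.0):
    """Streaming quantile estimate via stochastic subgradient descent.

    Keeps one float of state and sees each sample exactly once.
    lr and m0 are illustrative defaults, not tuned values.
    """
    m = m0
    for x in stream:
        m -= lr * pinball_subgradient(m, x, q)
    return m


# Hypothetical test stream: standard normal samples with a fixed seed.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# The 0.9 quantile of a standard normal is roughly 1.28.
estimate = sgd_quantile(data, q=0.9)
print(estimate)
```

If I have this right, a rule that instead moves by a fixed amount but only with probability `q` or `1 - q` would take the same step in expectation, which may be the analogy being drawn. I have not checked that against the paper's exact update, and the sketch says nothing about whether the 2-memory version corresponds to momentum.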