by GChevalier
2727 days ago
Also, I forgot:
to improve convergence, I'd use a Hann-Poisson window such as the one here: https://en.wikipedia.org/wiki/Window_function#Hann%E2%80%93P... I'd apply the window to randomly sampled mini-batches of consecutive points instead of optimizing the neural network on randomly sampled individual points or on the whole dataset at once. I guess that using a Hann-Poisson window will make the "gradient" valley easier to "ski down" with gradient descent, which is a greedy algorithm. I guess that the spectral leakage of the Hann-Poisson window, whose Fourier transform has no side lobes, will make the loss landscape decrease more monotonically at every point toward the global minimum.
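To make the idea above concrete, here is a rough NumPy sketch of one way it could look. It is only an illustration under assumptions: the function names, the default `alpha=2.0`, and the choice to apply the window to the residual before squaring are mine, not something taken from a particular library or from the linked article. The window formula itself is the standard Hann-Poisson definition (a Hann window multiplied by a Poisson, i.e. two-sided exponential, window).

```python
import numpy as np


def hann_poisson_window(length, alpha=2.0):
    """Hann window times a Poisson (exponential) window, peaking at the center.

    alpha controls how fast the exponential part decays (assumed default).
    """
    if length < 2:
        raise ValueError("length must be at least 2")
    n = np.arange(length)
    N = length - 1
    hann = 0.5 * (1.0 - np.cos(2.0 * np.pi * n / N))
    poisson = np.exp(-alpha * np.abs(N - 2.0 * n) / N)
    return hann * poisson


def sample_windowed_batch(signal, batch_len, rng, alpha=2.0):
    """Pick a random run of consecutive points and the window to weight it with."""
    start = rng.integers(0, len(signal) - batch_len + 1)  # upper bound is exclusive
    idx = np.arange(start, start + batch_len)
    return idx, signal[idx], hann_poisson_window(batch_len, alpha)


def windowed_mse(pred, target, window):
    """Squared error of the windowed residual, normalized by the window energy.

    Assumption: the window tapers the residual, so points near the edges of
    the mini-batch contribute less to the loss than points near the center.
    """
    residual = window * (pred - target)
    return np.sum(residual ** 2) / np.sum(window ** 2)


# Hypothetical usage: a toy signal and a placeholder "prediction".
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0.0, 10.0, 500))
idx, target, window = sample_windowed_batch(signal, batch_len=65, rng=rng)
pred = np.zeros_like(target)  # stand-in for the network output at positions idx
loss = windowed_mse(pred, target, window)
```

In a real training loop, `pred` would come from the network evaluated at `idx`, and `windowed_mse` would be rewritten in whatever autodiff framework is in use so gradients flow through it. Whether this actually smooths the optimization landscape is the guess being made above, not something this sketch demonstrates.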