Hacker News
by GChevalier 2727 days ago
Also, I forgot: to improve convergence, I'd use a Hann-Poisson window, such as the one described here: https://en.wikipedia.org/wiki/Window_function#Hann%E2%80%93P...

I'd apply the window to randomly sampled mini-batches of consecutive points, instead of optimizing the neural network on batches of individually sampled points or on the whole dataset at once. I guess that using a Hann-Poisson window will make the "gradient" valley easier to "ski down" with gradient descent, which is a greedy algorithm. I also guess that the spectral leakage caused by the Hann-Poisson window function will make the gradient landscape closer to monotonically decreasing at every point toward the global minimum.
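
To make the idea concrete, here is a rough sketch of what I mean by windowing a mini-batch of consecutive points. The window formula is the standard Hann-Poisson definition (a Hann window multiplied by an exponential taper). Everything else is an assumption for illustration: the helper names `hann_poisson` and `sample_windowed_batch`, the choice of `alpha=2.0`, the toy sine signal, and the decision to multiply the raw segment by the window (the same weights could instead be applied to per-point losses).

```python
import math
import random

def hann_poisson(length, alpha=2.0):
    """Return a Hann-Poisson window of `length` points (length >= 2).

    w[n] = 0.5 * (1 - cos(2*pi*n/N)) * exp(-alpha * |N - 2n| / N), with N = length - 1.
    """
    big_n = length - 1
    window = []
    for n in range(length):
        hann = 0.5 * (1.0 - math.cos(2.0 * math.pi * n / big_n))
        poisson = math.exp(-alpha * abs(big_n - 2.0 * n) / big_n)
        window.append(hann * poisson)
    return window

def sample_windowed_batch(signal, batch_len, rng, alpha=2.0):
    """Pick a random run of consecutive points and taper it with the window.

    Hypothetical helper: returns the start index and the windowed segment.
    """
    start = rng.randrange(0, len(signal) - batch_len + 1)
    segment = signal[start:start + batch_len]
    window = hann_poisson(batch_len, alpha)
    return start, [x * w for x, w in zip(segment, window)]

# Toy data (assumed): a sine wave standing in for the real time series.
rng = random.Random(0)
signal = [math.sin(0.02 * i) for i in range(1000)]
start, batch = sample_windowed_batch(signal, 65, rng)
```

The window goes to zero at both ends of the batch and peaks in the middle, so points near the batch edges contribute very little. Whether this actually smooths the optimization landscape is still just my guess.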