| HN Mirror


	by jclos 3093 days ago
	Its simplicity is its power. More complex methods (e.g. second-order methods) tend to be attracted to saddle points and produce bad results. Some metaheuristics, such as evolution strategies, are also used in specific cases (e.g. reinforcement learning). Minibatch gradient descent with a reasonable minibatch size and some form of momentum is the best we have.
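
	For anyone who wants to see how little code that recipe takes, here is a rough sketch of minibatch gradient descent with classical (heavy-ball) momentum, fitting a line to toy data. Everything specific in it is my own assumption for illustration, not something from a particular library: the function name `minibatch_sgd_momentum`, the learning rate, the momentum coefficient, the batch size, and the noise-free data.

```python
import random


def minibatch_sgd_momentum(xs, ys, lr=0.05, momentum=0.9,
                           batch_size=8, epochs=200, seed=0):
    """Fit y ~ w*x + b by minibatch gradient descent with heavy-ball momentum.

    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0      # parameters
    vw, vb = 0.0, 0.0    # velocity terms (running, decayed sum of past steps)
    indices = list(range(len(xs)))

    for _ in range(epochs):
        rng.shuffle(indices)  # fresh random minibatches every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            n = len(batch)
            # Gradient of the mean squared error over this minibatch only.
            gw = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / n
            gb = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / n
            # Momentum update: decay the old velocity, add the new step.
            vw = momentum * vw - lr * gw
            vb = momentum * vb - lr * gb
            w += vw
            b += vb
    return w, b


# Toy, noise-free data generated from y = 3x - 1 (an assumption for the demo).
xs = [i / 10 for i in range(-20, 21)]
ys = [3 * x - 1 for x in xs]

w, b = minibatch_sgd_momentum(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

	The whole optimizer is the four lines that update `vw`, `vb`, `w` and `b`; the rest is bookkeeping. Real training code would swap the hand-written gradient for automatic differentiation, but the update rule stays the same.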