Hacker News new | ask | show | jobs
by PeterisP 2951 days ago
"An enormous hyperparameter search and lots of elbow grease to carefully tune that dynamical system" i.e. the classic GDGS (gradient descent by grad student) approach where you have a grad student train a system, decide in which direction the parameters should be updated (i.e. look at the gradient), tweak the system and repeat until convergence.