Hacker News
by sdenton4 806 days ago
The training slowdown is not really a problem. There's a pretty wide range of robust, good-enough values that don't slow things down much at all. As with all optimizer cruft, the 'optimal' value is going to be problem-dependent and a pain in the butt to actually find, so it's best to pick a good-enough value that works in most contexts and not worry about it.