by isoprophlex 1821 days ago
If ReLU-introduced high frequency components are indeed the culprit, won't using "softened" ReLU (without discontinuity in the derivative at 0) everywhere solve the problem, too?
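To make the question concrete, here is a minimal sketch that compares the derivative of ReLU with that of softplus around 0. It assumes softplus is what "softened" ReLU means here; the `beta` sharpness parameter and the helper names are illustrative, not taken from the discussion. It only shows that the kink disappears and says nothing about whether that removes the high-frequency components.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softplus(x, beta=1.0):
    # log(1 + exp(beta * x)) / beta, computed stably via logaddexp.
    # beta is an assumed sharpness knob: larger beta approaches ReLU.
    return np.logaddexp(0.0, beta * x) / beta

def relu_grad(x):
    # Derivative of ReLU: 0 for x < 0, 1 for x > 0 (jumps at 0).
    return (x > 0).astype(float)

def softplus_grad(x, beta=1.0):
    # Derivative of softplus is the logistic sigmoid of beta * x.
    return 1.0 / (1.0 + np.exp(-beta * x))

# Compare the derivative just left and just right of 0.
eps = 1e-6
left, right = np.array([-eps]), np.array([eps])
relu_jump = (relu_grad(right) - relu_grad(left))[0]
softplus_jump = (softplus_grad(right) - softplus_grad(left))[0]

print("ReLU derivative jump at 0:    ", relu_jump)
print("softplus derivative jump at 0:", softplus_jump)
```

The ReLU derivative steps by a full unit across 0, while the softplus derivative changes only in proportion to `eps`. Raising `beta` makes softplus hug ReLU more closely and steepens that transition, so the choice of sharpness would likely matter for the frequency content.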