by rvarma 3043 days ago
I actually think leaky ReLUs are an interesting idea: they still provide a small, nonzero gradient when x < 0, which may slightly alleviate the vanishing gradient issue.
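To make the gradient point concrete, here's a rough sketch in plain Python. The function names and the slope alpha = 0.01 are my own assumptions for illustration (0.01 is a common choice, not something fixed by the definition).

```python
def relu_grad(x: float) -> float:
    """Gradient of the standard ReLU, max(0, x): zero for non-positive inputs."""
    return 1.0 if x > 0 else 0.0


def leaky_relu(x: float, alpha: float = 0.01) -> float:
    """Leaky ReLU: identity for x > 0, a small linear slope alpha otherwise."""
    return x if x > 0 else alpha * x


def leaky_relu_grad(x: float, alpha: float = 0.01) -> float:
    """Gradient of the leaky ReLU: 1 for x > 0, alpha (not 0) otherwise."""
    return 1.0 if x > 0 else alpha


# Compare the two gradients on a negative and a positive input.
for x in (-2.0, 3.0):
    print(f"x={x}: relu_grad={relu_grad(x)}, leaky_relu_grad={leaky_relu_grad(x)}")
```

For negative inputs the plain ReLU passes back a gradient of exactly zero, while the leaky version passes back alpha, so a unit that sits in the negative region still gets a (small) update.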