Hacker News
by roger_ 87 days ago
Great idea and seems quite obvious in hindsight.

Is it guaranteed to have the same effect on vanishing gradients though? What if it put weight 1 on a layer that had a tiny gradient?