Hacker News
Deriving the gradient for the backward pass of Layer Normalization (shreyansh26.github.io)
3 points by shreyansh26 379 days ago