by gdiamos 635 days ago
The backward pass for activations does, but typically not the backward pass for weight gradients.

Why compute gradients with respect to weights that aren't going to be updated?
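
A rough sketch of the distinction, using a toy two-layer scalar network written by hand. The function name, the `train_w1`/`train_w2` flags, and the numbers are all made up for illustration; this is not how any particular framework implements it. The frozen layer still needs its activation gradient so the error can reach the earlier trainable layer, but its weight gradient is never computed.

```python
def forward_backward(x, target, w1, w2, train_w1=True, train_w2=False):
    """Toy example: h = w1 * x, y = w2 * h, loss = 0.5 * (y - target)**2."""
    # Forward pass through two scalar "layers".
    h = w1 * x
    y = w2 * h

    # Gradient of the loss with respect to the output: dL/dy = y - target.
    dy = y - target

    grads = {}

    # Weight gradient for layer 2 (dL/dw2 = dy * h).
    # Skipped entirely when w2 is frozen, since nothing would consume it.
    if train_w2:
        grads["w2"] = dy * h

    # Activation gradient through layer 2 (dL/dh = dy * w2).
    # Still required when an earlier layer is trainable, even if w2 is frozen.
    if train_w1:
        dh = dy * w2
        grads["w1"] = dh * x

    return grads


# Hypothetical values: layer 2 frozen, layer 1 trainable.
grads = forward_backward(x=2.0, target=1.0, w1=0.5, w2=3.0)
print(grads)
```

With w2 frozen, the returned dict only has an entry for w1. The dy * w2 multiply still happens, the dy * h one doesn't.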