Hacker News
by cjamsonhn 160 days ago
Highly recommend this as well. Does a great job of helping you build intuition for why things like gradient descent and normalization work. Also gets into the weeds on training dynamics and how to check that they are behaving properly.