Hacker News
by zwaps 773 days ago
I mean, this is implicit in backpropagation: say you need to store gradients anyway, but if you get to zero loss then you are just done.
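
To make that concrete, here's a toy sketch of how I'd read it, not anyone's actual training code. It assumes a one-parameter model `y = w * x` with squared loss; `train`, the data point, and the learning rate are all made up so the loss hits exactly zero after one step. Once the loss is zero, the gradient is zero as well, so there is nothing left to update and the loop can just exit.

```python
# Toy sketch with hypothetical names: one-parameter model y = w * x, squared loss.
def train(x=1.0, y=2.0, w=0.0, lr=0.5, max_steps=100):
    steps = 0
    for _ in range(max_steps):
        err = w * x - y        # forward pass: prediction error
        loss = err * err       # squared loss
        if loss == 0.0:        # zero loss means zero gradient: nothing left to do
            break
        grad = 2.0 * err * x   # backward pass: d(loss)/dw
        w -= lr * grad         # gradient descent update
        steps += 1
    return w, steps

w, steps = train()
print(w, steps)
```

The exact `loss == 0.0` check only works because the numbers here were picked to land on zero exactly; with real floating-point training you'd compare against a small tolerance instead.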