Hacker News
by Bayes7, 937 days ago
Okay, I see that for inference. But for training it shouldn't matter, because I need to hold on to all my activations for the backward pass anyway, right? But yeah, fair point!
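
To make the training-side point concrete, here is a rough sketch in plain Python. It assumes a toy chain of scalar layers of the form y = tanh(w * x). The layer form and the function names (`forward_inference`, `forward_training`, `backward`) are made up for illustration and are not from the thread. Inference can overwrite each activation as soon as the next layer has consumed it, while the training forward pass keeps every layer input around because the backward pass reads them again.

```python
import math

def forward_inference(x, weights):
    """Inference: only the current activation is alive; earlier ones are overwritten."""
    for w in weights:
        x = math.tanh(w * x)
    return x

def forward_training(x, weights):
    """Training: save the input of every layer, since backward() needs it."""
    saved = []
    for w in weights:
        saved.append(x)           # one stored activation per layer
        x = math.tanh(w * x)
    return x, saved

def backward(out, saved, weights):
    """Gradient of the final output with respect to each weight."""
    grads = [0.0] * len(weights)
    upstream = 1.0                # d(out)/d(out)
    y = out                       # output of the layer currently being processed
    for i in reversed(range(len(weights))):
        x_in = saved[i]           # this is why the activation had to be kept
        local = 1.0 - y * y       # derivative of tanh, written in terms of its output
        grads[i] = upstream * local * x_in
        upstream = upstream * local * weights[i]
        y = x_in                  # input of layer i is the output of layer i - 1
    return grads

weights = [0.5, -1.2, 0.8]        # arbitrary example values
out, saved = forward_training(0.3, weights)
grads = backward(out, saved, weights)
print(len(saved), "activations kept for", len(weights), "layers")
```

In this toy setup the number of stored activations grows with depth during training, whereas `forward_inference` only ever holds one value.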