by p1esk
2779 days ago
"like needing to explicitly write a wrapper for the backwards calculation for custom layers, which you don’t need to do in Keras for example"

Not sure I understand: you will need to write a backward pass regardless of whether you use Keras, PyTorch, or anything else. With Keras, you would need to modify the underlying backend code (e.g. with tf.RegisterGradient or tf.custom_gradient). With PyTorch you write the backward() function, which is about the same amount of effort.
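For readers following along, here is a rough, framework-free sketch of what "writing a backward pass" for a custom layer amounts to. It uses neither Keras nor PyTorch: the `Softplus` class, its `forward`/`backward` methods, and the `numeric_grad` helper are hypothetical names made up for illustration, and the finite-difference check is only there to show that the hand-written derivative is plausible.

```python
import math


class Softplus:
    """Hypothetical custom layer: y = log(1 + exp(x)), applied elementwise."""

    def forward(self, xs):
        # Save the inputs; backward needs them to compute the local derivative.
        self.saved = list(xs)
        return [math.log1p(math.exp(x)) for x in xs]

    def backward(self, grad_out):
        # Chain rule: dL/dx = dL/dy * dy/dx, where dy/dx = sigmoid(x).
        return [g / (1.0 + math.exp(-x)) for g, x in zip(grad_out, self.saved)]


def numeric_grad(f, xs, eps=1e-6):
    """Central-difference estimate of d(sum(f(xs)))/dxs, used only as a check."""
    grads = []
    for i in range(len(xs)):
        hi = list(xs)
        lo = list(xs)
        hi[i] += eps
        lo[i] -= eps
        grads.append((sum(f(hi)) - sum(f(lo))) / (2 * eps))
    return grads


layer = Softplus()
xs = [-1.5, 0.0, 2.0]
ys = layer.forward(xs)
# For the loss L = sum(y), the upstream gradient dL/dy is all ones.
analytic = layer.backward([1.0] * len(xs))
numeric = numeric_grad(Softplus().forward, xs)
```

Whichever framework is used, the hand-written part is the body of `backward`: the derivative of the custom operation, multiplied by the gradient arriving from the layers above it.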
In PyTorch, you still have to define the backward function, keep track of the gradients, clear gradient values at the appropriate time, and explicitly trigger these computations in verbose optimizer invocation code.
I encourage you to check out how this works in Keras, because it is factually different from what you are saying, in ways that are specifically designed to remove certain kinds of boilerplate, overhead, and bookkeeping that PyTorch requires.
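To make the bookkeeping point concrete, here is a minimal sketch in plain Python, under the assumption that gradients accumulate into a per-parameter buffer until they are cleared (which is how PyTorch's `.grad` behaves). `Param`, `backward`, and `train` are hypothetical stand-ins, not any library's API.

```python
class Param:
    """Hypothetical scalar parameter with an accumulating gradient buffer."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


def backward(p, x, target):
    # Loss = (p.value * x - target) ** 2.
    # The gradient is ADDED to p.grad rather than overwriting it.
    p.grad += 2.0 * (p.value * x - target) * x


def train(data, lr, epochs):
    p = Param(0.0)
    for _ in range(epochs):
        for x, target in data:
            p.grad = 0.0               # bookkeeping: clear the stale gradient
            backward(p, x, target)     # explicit gradient computation
            p.value -= lr * p.grad     # explicit optimizer step
    return p.value


# Fit w in y = w * x to data generated with w = 3.
fitted = train([(1.0, 3.0), (2.0, 6.0)], lr=0.1, epochs=50)

# Without clearing, two backward calls leave the sum of both gradients behind.
stale = Param(0.0)
backward(stale, 1.0, 3.0)
backward(stale, 1.0, 3.0)
```

The three commented steps inside the loop are the kind of code the comment above refers to; a higher-level training API runs the same steps internally instead of leaving them to the caller.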