by osipov
715 days ago
What's your basis for claiming that Tinygrad can't compute the second-order partial derivatives (i.e. Hessians) needed for L-BFGS? Tinygrad, like PyTorch, uses automatic differentiation, which has no problem supporting nth-order derivatives.

Also, as a quasi-Newton method, L-BFGS does not require explicit (pre-)computation of the Hessian: it iteratively builds an implicit estimate of the inverse Hessian online, from successive differences of parameters and gradients.
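
To make the second point concrete, here is a rough pure-Python sketch of the standard L-BFGS two-loop recursion. It is not Tinygrad's or PyTorch's implementation; the names `lbfgs_direction`, `minimize`, `f` and `grad_f` are made up for illustration, and the backtracking line search is a simplification I chose for brevity. Note that the only derivative information it ever touches is the first-order gradient.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lbfgs_direction(grad, s_hist, y_hist):
    """Two-loop recursion: returns -H*grad, where H is the implicit
    inverse-Hessian estimate built from stored (s, y) pairs only."""
    q = list(grad)
    alphas = []
    # First loop: newest pair to oldest.
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        alphas.append((alpha, rho))
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
    # Initial scaling of the inverse Hessian (identity if no history yet).
    if s_hist:
        gamma = dot(s_hist[-1], y_hist[-1]) / dot(y_hist[-1], y_hist[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]
    # Second loop: oldest pair to newest.
    for (s, y), (alpha, rho) in zip(zip(s_hist, y_hist), reversed(alphas)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]


def minimize(f, grad_f, x0, m=5, iters=200, tol=1e-10):
    x, g = list(x0), grad_f(x0)
    s_hist, y_hist = [], []
    for _ in range(iters):
        if dot(g, g) < tol:  # squared gradient norm small enough
            break
        d = lbfgs_direction(g, s_hist, y_hist)
        # Simplified backtracking (Armijo) line search, capped at 60 halvings.
        t, fx, slope = 1.0, f(x), dot(g, d)
        for _ in range(60):
            if f([xi + t * di for xi, di in zip(x, d)]) <= fx + 1e-4 * t * slope:
                break
            t *= 0.5
        x_new = [xi + t * di for xi, di in zip(x, d)]
        g_new = grad_f(x_new)
        s = [a - b for a, b in zip(x_new, x)]  # parameter difference
        y = [a - b for a, b in zip(g_new, g)]  # gradient difference
        if dot(s, y) > 1e-12:  # keep only pairs with positive curvature
            s_hist.append(s)
            y_hist.append(y)
            s_hist, y_hist = s_hist[-m:], y_hist[-m:]
        x, g = x_new, g_new
    return x


# Toy problem (hypothetical): a convex quadratic with its minimum at (1, -2).
def f(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def grad_f(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


x_min = minimize(f, grad_f, [0.0, 0.0])
print(x_min)
```

The history lists `s_hist` and `y_hist` hold at most `m` vector pairs, so no Hessian is ever formed or differentiated; a framework only needs to supply `grad_f`, which is exactly what reverse-mode autodiff gives you.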