by fjkdlsjflkds
720 days ago
OP does not (seemingly) claim that tinygrad can't compute Hessians, only that a first-order optimization method was the only thing tried. Also, as a quasi-Newton method, L-BFGS does not require explicit (pre-)computation of the Hessian: it iteratively builds an implicit estimate of the inverse Hessian online, from successive gradient differences.
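To make the "implicit estimate" point concrete, here is a rough sketch of the standard L-BFGS two-loop recursion in plain Python. It is not taken from tinygrad or any particular library; the names `dot`, `lbfgs_direction`, `s_hist`, and `y_hist` are made up for illustration. The only inputs are the current gradient and a short history of parameter steps (`s`) and gradient differences (`y`), so no Hessian matrix is ever formed.

```python
def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def lbfgs_direction(grad, s_hist, y_hist):
    """Approximate (inverse Hessian) @ grad via the two-loop recursion.

    s_hist[i] = x_{i+1} - x_i          (parameter steps, oldest first)
    y_hist[i] = grad_{i+1} - grad_i    (gradient differences, oldest first)
    Hypothetical helper for illustration; assumes dot(y, s) > 0 for each pair.
    """
    q = list(grad)
    coeffs = []
    # First loop: walk the history from newest pair to oldest.
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        coeffs.append((alpha, rho))

    # Initial inverse-Hessian guess: a scaled identity from the newest pair.
    if s_hist:
        gamma = dot(s_hist[-1], y_hist[-1]) / dot(y_hist[-1], y_hist[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]

    # Second loop: walk the history from oldest pair to newest.
    for (s, y), (alpha, rho) in zip(zip(s_hist, y_hist), reversed(coeffs)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return r  # the search direction is the negative of this vector
```

With an empty history this reduces to plain gradient descent; each stored pair costs only a few dot products, which is why memory and compute stay linear in the number of parameters.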