|
|
|
|
|
by kxyvr
809 days ago
|
|
This is not entirely true. A typical trick for accomplishing this is to simply use the Hessian approximation `g'(x)*g'(x)` (star denotes adjoint), thus cutting off the second-derivative evaluation of g, and then use a modified CG method like Newton-Krylov for line-search methods or Steihaug-Toint for trust-region methods. It's very robust, matrix-free, and doesn't require an excessive amount of code to implement. Further, the method can be preconditioned with a positive definite preconditioner. And, to be clear, there's no guarantee that a zero residual be found, only that `g'(x)*g(x)=0`, but that tends to be the nature of nonlinear equations. |
|
It is not robust, in fact it's well-known as a numerically unstable method. g'(x)*g'(x)` is numerically unstable because it squares the condition number of the matrix. If you have a condition number of 1e-10, which is true for many examples in scientific problems (like the DFN battery model in the paper), the condition number of J'J is 1e-20, which effectively means a linear solve is unable to retrieve any digits of accuracy with 64-bit floating point numbers. So while I agree that you can use symmetric linear solvers as a small performance improvement in some cases, you have to be very careful when you do this as it is a very numerically unstable method. For some details, see https://math.stackexchange.com/a/2874902
Also, if you look at Figure 5 in the paper, you will see that there is a way to choose operator conditioning https://docs.sciml.ai/LinearSolve/stable/basics/OperatorAssu..., and if you tell it to assume the operator is well-conditioned it will actually use this trick. But we do not default to that because of the ill-conditioning.