by photochemsyn 488 days ago
Similar computational demand for useful results, too. The specific uses of linear algebra are a bit different: QM chemistry is about eigenvalue solvers for large matrices representing the system Hamiltonian, HΨ = EΨ, which doesn't come into play in LLMs, where the linear algebra seems mostly used for chain-rule differentiation in matrix form.
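To make the eigenvalue side concrete, here is a rough sketch of a Hamiltonian eigenvalue problem on a grid. Everything in it is an illustrative assumption rather than anything from a real chemistry code: a particle in a 1D box of length 1, units with hbar = m = 1, a finite-difference Laplacian, and 200 grid points.

```python
import numpy as np

n = 200                # interior grid points (arbitrary choice)
h = 1.0 / (n + 1)      # grid spacing for a box of length 1

# Finite-difference form of -1/2 d^2/dx^2 (hbar = m = 1, zero potential in the box).
hamiltonian = (np.diag(np.full(n, 1.0 / h**2))
               - np.diag(np.full(n - 1, 0.5 / h**2), k=1)
               - np.diag(np.full(n - 1, 0.5 / h**2), k=-1))

# Solve H psi = E psi; eigh returns eigenvalues in ascending order.
energies, states = np.linalg.eigh(hamiltonian)

# The analytic levels for this box are n^2 * pi^2 / 2, so the lowest
# numerical eigenvalue should sit close to pi^2 / 2.
print(energies[0], np.pi**2 / 2)
```

Real electronic-structure matrices are far larger and usually need iterative solvers, but the operation is the same diagonalization.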

There are similarities in some areas, e.g. gradient descent compared to self-consistent field (SCF) iterations in computational QM:

In Hartree-Fock or Kohn-Sham DFT:

        Guess a wavefunction (or density),

        Construct a Fock (or Kohn-Sham) matrix,

        Solve the eigenvalue problem for that matrix,

        Update the density,

        Repeat until convergence to a physically meaningful value for comparison to experimental observations.
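
The SCF steps above can be sketched with a toy model. This is not Hartree-Fock or DFT: the "Fock" matrix here is an assumed mean-field form (a 4-site hopping chain plus an on-site repulsion `u` times the site density), and `toy_scf`, `mix`, and the site energies are all made-up names and values chosen only to show the loop structure.

```python
import numpy as np

def toy_scf(h_core, u, n_occ, mix=0.5, tol=1e-8, max_iter=200):
    """Damped SCF loop for a toy mean-field model (not a real Hartree-Fock code)."""
    # Guess a density from the eigenvectors of the bare Hamiltonian.
    _, c = np.linalg.eigh(h_core)
    p = c[:, :n_occ] @ c[:, :n_occ].T
    for it in range(1, max_iter + 1):
        # Construct the density-dependent "Fock" matrix.
        f = h_core + u * np.diag(np.diag(p))
        # Solve the eigenvalue problem for that matrix.
        e, c = np.linalg.eigh(f)
        # Update the density from the n_occ lowest orbitals, with damping.
        p_new = c[:, :n_occ] @ c[:, :n_occ].T
        change = np.linalg.norm(p_new - p)
        p = (1 - mix) * p + mix * p_new
        # Repeat until the density stops changing.
        if change < tol:
            return p, e, it, True
    return p, e, max_iter, False

# 4-site chain with hopping -1 and slightly different site energies (assumed values).
h_core = -np.eye(4, k=1) - np.eye(4, k=-1) + np.diag([0.0, 0.5, 0.0, -0.5])
density, orbital_energies, iterations, converged = toy_scf(h_core, u=1.0, n_occ=2)
print(converged, iterations, np.diag(density))
```

The damping factor `mix` plays a role loosely similar to a learning rate: too aggressive an update can make the iteration oscillate instead of settling.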
In neural network training:

        Guess initial parameters,

        Compute a forward pass to get predictions,

        Evaluate a loss that measures prediction error,

        Backprop to compute the gradient of the loss function wrt parameters (the direction of steepest increase),

        Update parameter values via a small step in the opposite direction,

        Repeat until the model converges to a good-enough solution that pleases the human user.
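
For comparison, the training loop above can be sketched on the smallest possible "network", a single linear unit fit by plain gradient descent. The data (a noiseless line y = 3x + 1), the learning rate, and the name `train_linear` are assumptions made up for illustration, and the gradient is written out by hand instead of using an autodiff library.

```python
import numpy as np

def train_linear(x, y, lr=0.1, tol=1e-6, max_iter=5000):
    """Gradient descent on mean squared error for the model pred = w * x + b."""
    w, b = 0.0, 0.0                       # guess initial parameters
    for _ in range(max_iter):
        pred = w * x + b                  # forward pass
        loss = np.mean((pred - y) ** 2)   # loss measuring prediction error
        # Backward pass: chain rule applied by hand.
        d_pred = 2.0 * (pred - y) / x.size
        grad_w = np.sum(d_pred * x)
        grad_b = np.sum(d_pred)
        # Stop once the gradient is essentially zero.
        if np.hypot(grad_w, grad_b) < tol:
            break
        # Small step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss

x = np.linspace(-1.0, 1.0, 50)
y = 3.0 * x + 1.0
w, b, final_loss = train_linear(x, y)
print(w, b, final_loss)
```

Unlike the SCF sketch, there is no eigenvalue solve anywhere in the loop; the only linear algebra is the products that make up the forward and backward passes.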
Neither has much to do with the original article, though.