| To plug my own field a bit: in materials science and chemistry there is a lot of excitement about using machine learning to get better simulations of atomic behavior. This could open up exciting areas in drug and alloy design, and maybe help find new CO2-capturing materials or better cladding for fusion reactors, to name just a few. The idea is that to solve these problems you need to solve the Schrödinger equation (1). But the cost of solving the Schrödinger equation scales really badly with the number of electrons, and it can't be computed directly for more than a few small cases. Even Density Functional Theory (DFT), the most popular approximation that is still reasonably accurate, scales as N^3 with the number of electrons, with a pretty big prefactor. A reasonable rule of thumb would be 12 hours on 12 nodes (each node being 160 CPU cores) for 256 atoms. You can play with settings and increase your budget to maybe get to 2000 atoms (and only for a few timesteps), but good luck beyond that. Machine learning seems to be really useful here. In my own work on aluminium alloys I was able to run in seconds on a laptop the same simulations that would have needed hours on the supercomputer. Or I could do simulations with tens of thousands of atoms for long periods of time on the supercomputer. The most famous application is probably AlphaFold from DeepMind. There are a lot of interesting questions people are still working on: What are the best input features? We don't have any nice equivalent to CNNs that is universally applicable, though some have tried 3D convnets. One of the best methods right now involves taking spherical-harmonic-based approximations of the local environment in some complex way I've never fully understood, but which is closer to the underlying physics. Can we put physics into these models? Almost all these models fail in dumb ways sometimes. 
For example, if I begin to squish two atoms together they should eventually repel each other, and that repulsive force should grow really fast (OK, maybe they fuse into a black hole or something, but we're not dealing with that kind of esoteric physics here). But all machine learning potentials will by default fail to learn this and will only learn the repulsion down to the closest distance between any two atoms in their training set. Beyond that they guess wildly. Some people are able to put this physics into the model directly, but I don't think we have it totally solved yet. How do we know which atomic environments to simulate? These models can really only interpolate; they can't extrapolate. But while I can get an intuition for interpolation in low dimensions, once your training set consists of many features over many atoms in 3D space this becomes a high-dimensional problem. In my own experience, I can get really good energies for the shearing behavior of strengthening precipitates in aluminium without directly putting the structures in. But was this extrapolated or interpolated from the other structures? Not always clear. (1) Sometimes also the relativistic Dirac equation, e.g. electrons in some of the heavier elements move at relativistic speeds. |
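To make the short-range failure concrete, here is a minimal sketch. Everything in it is an assumption for illustration: a Lennard-Jones pair energy in reduced units plays the role of the expensive reference calculation, and a degree-6 polynomial fit is a crude stand-in for an ML potential (real ML potentials are far more elaborate, but the extrapolation issue is the same in spirit). The training distances stop at r = 0.95, and the model is then asked about a pair squished to r = 0.6.

```python
import numpy as np
from numpy.polynomial import Polynomial

def lennard_jones(r):
    """Reference pair energy in reduced units (epsilon = sigma = 1)."""
    return 4.0 * (r**-12 - r**-6)

# "Training set": pair distances only down to r = 0.95, as if no closer
# pair ever appeared in the reference simulations.
r_train = np.linspace(0.95, 2.5, 200)

# Stand-in for an ML potential: a plain least-squares polynomial fit.
model = Polynomial.fit(r_train, lennard_jones(r_train), deg=6)

# Worst error inside the sampled range.
err_inside = np.max(np.abs(model(r_train) - lennard_jones(r_train)))

# Error for a squished pair, well below the closest training distance.
r_close = 0.6
err_outside = abs(model(r_close) - lennard_jones(r_close))

print(f"max error inside the training range: {err_inside:.3g}")
print(f"true energy at r={r_close}: {lennard_jones(r_close):.4g}")
print(f"model energy at r={r_close}: {model(r_close):.4g}")
```

The fit looks fine wherever it has data and misses the steep repulsive wall below that. One common remedy, in line with "putting the physics in", is to add an explicit repulsive pair term and let the model learn only the remainder.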
Context for the non-mat-sci crowd: numerically solving the Schrödinger equation essentially means constructing a large matrix that describes all the electron interactions and computing its eigenvalues (iterated to convergence, because the electron interactions themselves depend on the solutions). Density functional theory (for solids) uses a Fourier expansion for each electron (these are the one-electron wave functions), so the complexity of each eigensolve is cubic in the number of valence electrons times the number of Fourier components.
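The "build a matrix, diagonalize, repeat until it stops changing" loop can be sketched in a few lines. This is a toy under loud assumptions, not DFT: a spinless 1D chain with one orbital per site, a made-up on-site interaction `u` times the local density, a weak linear external potential, and simple linear mixing. The function name `toy_scf` and all parameters are hypothetical.

```python
import numpy as np

def toy_scf(n_sites=8, n_electrons=4, u=0.5, mix=0.3, tol=1e-8, max_iter=500):
    """Toy self-consistent loop on a 1D chain (illustrative model only)."""
    # Fixed part of the matrix: nearest-neighbour hopping plus a weak ramp potential.
    h0 = -(np.eye(n_sites, k=1) + np.eye(n_sites, k=-1))
    h0 += np.diag(0.2 * np.arange(n_sites))

    density = np.full(n_sites, n_electrons / n_sites)  # initial guess
    for iteration in range(1, max_iter + 1):
        # The matrix depends on the density we are trying to find.
        h = h0 + u * np.diag(density)
        energies, orbitals = np.linalg.eigh(h)  # the cubic-cost step

        # Fill the lowest n_electrons orbitals and rebuild the density.
        new_density = np.sum(orbitals[:, :n_electrons] ** 2, axis=1)
        change = np.max(np.abs(new_density - density))

        # Linear mixing: move only part of the way to the new density.
        density = (1 - mix) * density + mix * new_density
        if change < tol:
            return density, energies, iteration, True
    return density, energies, max_iter, False

density, energies, n_iter, converged = toy_scf()
print(f"converged: {converged} after {n_iter} iterations")
print("site densities:", np.round(density, 4))
```

Each pass through the loop costs one eigensolve, so the cubic scaling is paid once per iteration, and a real plane-wave calculation does this with a vastly larger matrix.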
The tight binding approximation is cool because it uses a small spherical harmonic basis set to represent the wavefunctions in real space - you still have the cubic complexity of the eigensolve, and you can model detailed electronic behavior, but the interaction matrix you’re building is much smaller.
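As a rough illustration of how small the tight-binding matrix is, here is the simplest possible case: a ring of atoms with a single s-like orbital each and nearest-neighbour hopping. The hopping strength `t` and the ring geometry are assumptions chosen because the eigenvalues are known in closed form, -2t cos(2πk/N), which gives something to check against. A realistic model would carry several orbitals per atom (s, p, d), so the matrix dimension would be the number of atoms times the orbitals per atom.

```python
import numpy as np

def ring_hamiltonian(n_sites, t=1.0):
    """One s-like orbital per site, nearest-neighbour hopping -t, periodic ring."""
    h = np.zeros((n_sites, n_sites))
    for i in range(n_sites):
        j = (i + 1) % n_sites  # right-hand neighbour, wrapping around the ring
        h[i, j] = h[j, i] = -t
    return h

n_sites = 12
numeric = np.linalg.eigvalsh(ring_hamiltonian(n_sites))  # ascending order

# Closed-form band energies for the same ring, sorted for comparison.
analytic = np.sort(-2.0 * np.cos(2 * np.pi * np.arange(n_sites) / n_sites))

print("max deviation from analytic result:", np.max(np.abs(numeric - analytic)))
```

Twelve atoms give a 12 x 12 matrix here, whereas a plane-wave description of the same system would need many Fourier components per electron.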
Back to the ML variant: it’s a hard problem because ultimately you’re trying to predict a matrix that has the same eigenvalues as your training data, but there are tons of degeneracies that lead to loads of unphysical local minima (in my experience anyway; this is where I got stuck with it). The papers I’ve seen deal with it by basically only modeling deviations from an existing tight binding model, which in my opinion only kind of moves the problem upstream.
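One way to see where such ambiguity can come from (this is an interpretation, not necessarily the exact failure described above): eigenvalues alone do not pin down a matrix. Any orthogonal rotation of a symmetric matrix has the same spectrum, so a loss that only compares eigenvalues cannot tell them apart. The sketch below uses random matrices and a hypothetical `spectrum_loss` helper purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_symmetric(n):
    """Random real symmetric matrix, standing in for a reference Hamiltonian."""
    a = rng.normal(size=(n, n))
    return (a + a.T) / 2

def spectrum_loss(h_pred, h_ref):
    """Mean squared difference between sorted eigenvalues (hypothetical loss)."""
    return np.mean((np.linalg.eigvalsh(h_pred) - np.linalg.eigvalsh(h_ref)) ** 2)

h_ref = random_symmetric(6)

# Random orthogonal matrix from a QR decomposition.
q, _ = np.linalg.qr(rng.normal(size=(6, 6)))

# Same spectrum as h_ref, but different matrix elements.
h_rotated = q @ h_ref @ q.T

loss = spectrum_loss(h_rotated, h_ref)
element_gap = np.max(np.abs(h_rotated - h_ref))

print(f"eigenvalue loss: {loss:.3g}")
print(f"largest difference between matrix elements: {element_gap:.3g}")
```

The eigenvalue loss is zero to rounding error while the matrices themselves differ substantially, so an optimizer has a whole family of equally "good" answers, most of which have no physical meaning as interaction matrices. Anchoring the prediction to an existing tight binding model is one way of picking a member of that family.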