Hacker News
by Jordanpomeroy 487 days ago
Very well explained to a lay person.

Are PINNs the current state of the art in ML methods for solving PDEs? What are their limitations?


It depends on the PDE and what you want to do with it. A PINN requires:

1. Some example data or other way to add boundary conditions

2. Autograd over PDE constraints

3. A training loop incorporating both of those

And it produces

a. An approximate, differentiable, mesh-free solution

PINNs are most applicable when (1) is expensive (since that expense will apply more to traditional solvers, especially with fine meshes) and when the error in (a) is acceptable.
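To make items 1 to 3 concrete, here is a minimal sketch on a toy problem that is purely an assumption for illustration: u'(x) = -u(x) on [0, 1] with u(0) = 1. It runs on the Python standard library alone, so it swaps autograd for a hand-written x-derivative and central finite differences over the parameters; a real PINN would use an autograd framework for both. The network size, collocation points, and starting weights are all hypothetical.

```python
import math

# Hypothetical toy problem: u'(x) = -u(x) on [0, 1], u(0) = 1 (exact: exp(-x)).
HIDDEN = 3
COLLOCATION = [i / 10 for i in range(11)]  # where the ODE residual is penalized


def u_and_dudx(params, x):
    """One-hidden-layer tanh network and its x-derivative (written by hand)."""
    u, dudx = 0.0, 0.0
    for j in range(HIDDEN):
        a, w, b = params[3 * j: 3 * j + 3]
        t = math.tanh(w * x + b)
        u += a * t
        dudx += a * w * (1.0 - t * t)
    return u, dudx


def loss(params):
    """Mean squared ODE residual plus the boundary-condition penalty."""
    residual = 0.0
    for x in COLLOCATION:
        u, dudx = u_and_dudx(params, x)
        residual += (dudx + u) ** 2
    u0, _ = u_and_dudx(params, 0.0)
    return residual / len(COLLOCATION) + (u0 - 1.0) ** 2


def grad(params, eps=1e-6):
    """Central finite differences; an autograd library would replace this."""
    g = []
    for i in range(len(params)):
        up = params[:i] + [params[i] + eps] + params[i + 1:]
        down = params[:i] + [params[i] - eps] + params[i + 1:]
        g.append((loss(up) - loss(down)) / (2 * eps))
    return g


def train(params, steps=300, lr=0.1):
    """Gradient descent that halves the step whenever the loss would not drop."""
    current = loss(params)
    for _ in range(steps):
        g = grad(params)
        trial = [p - lr * gi for p, gi in zip(params, g)]
        trial_loss = loss(trial)
        if trial_loss < current:
            params, current = trial, trial_loss
        else:
            lr *= 0.5
    return params, current


# Arbitrary starting weights, laid out as (a, w, b) per hidden unit.
initial = [0.5, 1.0, 0.1, -0.5, 0.5, -0.2, 0.3, -1.0, 0.4]
initial_loss = loss(initial)
trained, final_loss = train(initial)
```

Calling `u_and_dudx(trained, 0.5)` gives the approximate solution and its derivative at any point, with no mesh involved, which is the output described in (a). How close it lands to exp(-0.5) depends on the network size and the number of training steps.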

Regarding the error, PINNs are still extremely useful in generating an initial state to pass to a traditional solver even when the error is not tolerable, so that's not _really_ a concern. The main consideration is how expensive a particular problem is to solve classically. If it's too cheap, the PINN will never beat it.

You have a secondary consideration with (2) and (3). The training loop is a fixed cost which you can amortize over many executions, but you have to use the network enough times for that to actually pay off.
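The amortization argument is simple arithmetic. As a rough sketch, with hypothetical cost figures in whatever unit you like (seconds, dollars), the PINN wins once the training cost plus the per-run inference cost drops below the cost of running the classical solver the same number of times:

```python
import math


def break_even_runs(train_cost, pinn_cost_per_run, classical_cost_per_run):
    """Smallest number of runs at which train-once-then-infer is strictly
    cheaper than running the classical solver every time, or None if never."""
    saving = classical_cost_per_run - pinn_cost_per_run
    if saving <= 0:
        return None  # the classical solve is already as cheap: no payoff
    # Need train_cost + n * pinn_cost < n * classical_cost, i.e. n > train_cost / saving.
    return math.floor(train_cost / saving) + 1


# Hypothetical numbers: one hour of training, 1 s per inference, 10 s per classical solve.
runs_needed = break_even_runs(3600.0, 1.0, 10.0)
never = break_even_runs(3600.0, 2.0, 1.0)
```

With those made-up numbers, `runs_needed` is 401 and `never` is `None`, which is the "if it's too cheap, the PINN will never beat it" case.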

The last point I want to bring up is that you can sometimes get value from the extra features in (a). Perhaps you want to use the PINN to figure out where your mesh should be finer, or you have a derived field you want to inspect. Neural net gradients in general tend to poorly approximate real gradients if you only train on the function itself, but PINNs have the gradients you're likely to care about baked into their definition (and can thus approximate them well), and they'll model those much more cheaply than traditional solvers will.

We used them for a few things at my last job, and they were definitely worth it. We erred toward smaller (faster) nets with higher errors just to accelerate convergence with a classical solver.
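A small illustration of why a rough initial state helps, as a sketch only: plain Jacobi iteration on a 1D Poisson problem, started once from zeros and once from a guess that has the right shape but is 10% too low. The guess is a stand-in for a small, inaccurate network; no network is trained here, and the problem, grid size, and tolerance are assumptions chosen for brevity.

```python
import math


def jacobi_poisson(f, guess, tol=1e-8, max_iters=100_000):
    """Jacobi iteration for -u'' = f on a uniform grid over (0, 1), u = 0 at both ends.

    Returns (solution, iterations used)."""
    n = len(f)
    h2 = (1.0 / (n + 1)) ** 2
    u = list(guess)
    for it in range(1, max_iters + 1):
        new = [
            0.5 * ((u[i - 1] if i > 0 else 0.0)
                   + (u[i + 1] if i < n - 1 else 0.0)
                   + h2 * f[i])
            for i in range(n)
        ]
        change = max(abs(a - b) for a, b in zip(new, u))
        u = new
        if change < tol:
            return u, it
    return u, max_iters


n = 20
xs = [(i + 1) / (n + 1) for i in range(n)]
f = [math.pi ** 2 * math.sin(math.pi * x) for x in xs]  # exact solution: sin(pi x)

cold_guess = [0.0] * n
# Stand-in for an approximate network output: right shape, 10% too low.
warm_guess = [0.9 * math.sin(math.pi * x) for x in xs]

_, cold_iters = jacobi_poisson(f, cold_guess)
solution, warm_iters = jacobi_poisson(f, warm_guess)
```

Both runs converge to the same discrete solution, but `warm_iters` comes out smaller than `cold_iters` because the iteration only has to remove the remaining 10% of error. The saving grows with how slowly the classical method converges, which is why it matters most on expensive problems.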

I get that PINN is a less expensive approximate solution method. If so, how does it outperform the many approximate, coarse numerical methods?

1. Those methods are coarse. The interpolation they provide is worse than what a PINN provides, meaning that equivalently performing PINNs (compared to coarse numerical methods) can easily and cheaply serve as better initializations for your finer numerical methods.

2. Go back to (1) from my previous message. For some intuition, fiddly solutions take a long time to optimize. Your only options (aside from spending more time and money) are tailoring the initial conditions and the algorithm for your particular problem. You see that a lot in, e.g., 1-3 atom quantum chemistry, where a good choice of basis functions is worth several papers. A neural network allows you to automagically bake everything that's hard about your problem into the training step and amortize those hard calculations across many experiments. It's not superior to enough man-centuries of human intuition, but it's dead simple to deploy, and for those sorts of hard problems it definitely beats a single human century of effort. Once you have a neural network output, the problem is well conditioned and suitable for refinement by a classical solver.

For a somewhat concrete example, imagine a problem where the space is largely uninteresting but there are a few tight swirls here and there. Coarse numerical methods can't really do anything with those. Adaptive-precision numerical methods can, but they're slow, and you have to re-run an intensive solving step for every new input. The PINN solution bakes everything that's hard about that into the neural net structure, and its solution will have approximately the right swirls in approximately the right places. If you want to refine them further, the fact that your solver doesn't have to dynamically handle resolution anymore and doesn't have to deal with any major phase shifts makes it much easier to iterate on via the normal classical methods.

Many thanks for your detailed input. For your concrete example, existing solutions employ FEM. My understanding of your point is that a NN abstracts away the meshing rules and learns the correct resolution in the areas of interest? I could see how this could be beneficial for quick solutions before a full blown solver.

If my above understanding is correct, then the following question is: why not use a NN to generate meshes directly? Let the classical solvers do what they do best: solve. Let NNs do what they do best: take care of the messy reality of geometry. This approach would actually give provable error bounds on the solution. I understand there are existing works on NN mesh generation, but I do not know of any work that proves error bounds or has been incorporated into mainstream engineering software. Any hints?

(Thanks for this fascinating discussion.)

> [thank you]

You're welcome! Thank you for your questions! This has been a ton of fun on my part too.

> existing solutions employ FEM

FEM is pretty great for a lot of problems. Some fields are devoted to particularly tricky PDEs and go a long ways beyond that to generate asymptotically better solutions that don't generalize to other PDEs. Some easier problems have insights that improve on FEM.

> NN abstracts away the meshing rules and learns the correct resolution in the areas of interest

Something like that. That's certainly how I described it. The internals of a NN are a bit more wishy-washy, and for a variety of pathological problems (largely non-physical problems, e.g., ones devolving into rapidly tightening, infinite-curl spirals) none of the outputs will make intuitive sense when plotted against progressively finer-meshed classical solvers. For nice enough problems though (E&M with a smattering of QM, CFD, ...) that's approximately the net effect. Large portions of the weight space are devoted to interesting stuff, smaller portions to how it all fits together, and as you feed in information you'll naturally have more computation done with respect to the parts of the solution that need it.

> why not use a NN to generate meshes directly

If I'm understanding correctly, you're saying that this would be contrasted with the current technique of generating outputs at various inputs. There's nothing wrong with that idea per se, but it's about as computationally intensive as generating approximate outputs at each of those mesh descriptors, so you might as well use that information. Combine that with the fact that a lot of these problems are extremely messy (how long does it take to transform gaussian noise or your favorite other initialization into a benzene molecule? Answer: weeks to months depending on your desired degree of accuracy, much worse for relativistic atoms), and a classical solver will still struggle if all you do is give it a mesh and tell it to go wild.

> would give provable bounds on the solution

That point is a little interesting since it requires assumptions about the bounds of various partial derivatives in each region. In real-world problems, even when you can come up with such bounds, often you can't easily get those bounds to depend meaningfully on the mesh size (and certainly not on local properties in regions of widely spaced meshes). The net effect is that you can't prove much about the solution quality for an arbitrary PDE just based on mesh specifics (other than asymptotic information, which we can prove very easily).
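For intuition, take the simplest textbook case, which is an assumption here and not the general FEM setting: piecewise-linear interpolation in 1D. On an element K of width h_K, with I_h u the linear interpolant of a twice-differentiable u, the error bound is a mesh factor times a derivative bound:

```latex
\max_{x \in K} \lvert u(x) - (I_h u)(x) \rvert
  \;\le\; \frac{h_K^2}{8} \, \max_{x \in K} \lvert u''(x) \rvert
```

The h_K^2 factor is easy to control by refining the mesh. The max |u''| factor is the assumption about the solution itself, and it is the part that is hard to pin down locally for an arbitrary PDE.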

> incorporated into mainstream engineering software

No clue, but I suspect not yet. Everything I've read about, used, written, or seen has been a one-off PINN.

> there are existing works on NN mesh generation, but ... proves error bounds

That's one of those things where I'd be strongly inclined to let the traditional software do its job. Much like using ChatGPT for recipe generation -- you might list some ingredients you do or don't want used, give the model a persona capable of cooking the thing you want to eat (to bias it away from the dregs of the internet), and ask for ideas and then maybe a followup or three. Once you have that result, you won't just blindly broil your shrimp at 550F for 180min; you'll independently verify the results (and still, hopefully, save time overall since you now at least know the right search terms and whatnot).

These NN results are similar. They're just approximations, and their best use IMO is feeding them straight into a tool that improves their accuracy and gives you known error bounds. The chief advantage is the speed with which you can obtain results, and the fact that the speed transfers when used as an initialization elsewhere is a very happy surprise.

> generate meshes directly

The status quo isn't bad for that for most classes of problems. For small, simple problems you'd never use a PINN. For large, complicated problems, classical techniques are so slow that the overhead of just sampling NN output to uncover the mesh isn't a huge problem. It's the in-between cases where you might want something smarter. I've seen a few papers and a few problems, but it doesn't look like there's a lot of interest. I'm not sure why, but collecting a few of those problems, trying to come up with real-world use cases where you'd need to solve tens of thousands or more of them (to make the PINN training worth it), and then using that as the backdrop for your project is probably the first direction I'd take if I had to work on that middle ground.

> any hints?

Let the NN do NN stuff. Expensively transform bulk data into a model that can much more cheaply approximate that data. PINNs use gradient information to replace most of that data (relying, then, on low sample counts of experimental or synthesized data). However you do it, the goal is to distill something expensive and messy into a model and then use the model to do something. Nobody has good error tracking via NNs, so don't use the NNs for that; use the NNs to feed data into a tool with good error tracking. Similarly with any other hard criterion.

what a well-written response. thank you.

may I ask which applications your work involves? your comments exhibit an exceptionally deep level of knowledge. As I mentioned in another comment, I am aware of some major authors' works (and was in the same research group at one point) but you exhibit a level of understanding uncommon even among those specialists.

Thank you!

As far as I can tell, PINNs are promising and an active research area, but they are also young and far from being as widely adopted as finite element methods (at least that's my experience in academic environments).

I do see great improvements being made, both in performance and in applications.

One aspect I didn't discuss in the post is the use for inverse solution search, where you fit experimental data to your equation, and where your parameters and your initial conditions can also be trainable parameters. This has great potential to improve the methodology of experimental results analysis.

> Are PINNs the current state of the art in ML methods for solving PDEs? What are their limitations?

I guess in a way they are. They aren't new; they have been around since the 90s [1]. The problem with them is that you typically need to train them on a specific problem (boundary conditions, domain, equation, PDE coefficients, etc.). Compared to a traditional solver, the training is much slower, and on top of that the results are typically much less accurate. The PDE + NN community has a bit of a problem dealing with this in general [2]: there are tons of papers that make NNs look much better at solving PDEs than they are compared to traditional solvers.

[1] https://www.cs.uoi.gr/~lagaris/papers/TNN-LLF.pdf

[2] https://www.nature.com/articles/s42256-024-00897-5