by ChrisRackauckas 2162 days ago
Yes, and our recent work https://arxiv.org/abs/2001.04385 gives a fairly general form for how to mix known scientific structural knowledge directly with machine learning. In fact, some of these PDE solvers are just instantiations of specific choices of universal differential equations. I agree that in many cases the "fully uninformed" physics-informed neural network won't work well, but we need to fully optimize a library with all of the training techniques possible in order to prove that, which is what we plan to do. In the end, I think PINNs will be most applicable to (1) non-local PDEs where classical methods have not fared well, so things like fractional differential equations, and (2) very high dimensional PDEs, like hundreds of dimensions, but paired with constraints on the architecture to preserve physical quantities and relationships. But of course, a fractional differential equation is not an example for the first pages of tutorials, since such equations are quite niche!

You've got a lot of broken references (??) in that preprint, BTW.

I think I understand why you're putting in the learned derivative operator, but I think it's rarely desirable. Computing derivatives with compatibility properties is a well-studied domain (e.g., finite element exterior calculus), as is tensor invariance theory (e.g., Zheng 1994, though this subject is sorely in need of a modern software-centric review). When the exact theory is known and readily computable, it's hard to see science/engineering value in "learned" surrogates that merely approximate the symmetries.

More generally, it is disheartening to see trends that would conflate discretization errors with modeling errors, lest it bring back the chaos of early turbulence modeling days that prompted this 1986 Editorial Policy Statement for the Journal of Fluids Engineering. https://jedbrown.org/files/RoacheGhiaWhite-JFEEditorialPolic...
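
To make the distinction concrete, here is a minimal sketch on a toy ODE. Everything in it is an assumption chosen for illustration (the "true" dynamics du/dt = -u - 0.5u^2, the misspecified model du/dt = -u, forward Euler, and all function names); none of it comes from the editorial statement. Refining the step size drives the discretization error toward zero, while the modeling error does not move, so the total error plateaus.

```python
import math

def euler(f, u0, t_end, n_steps):
    """Forward Euler for du/dt = f(u) from t = 0 to t_end."""
    dt = t_end / n_steps
    u = u0
    for _ in range(n_steps):
        u += dt * f(u)
    return u

def truth(t):
    # Exact solution of the assumed "true" dynamics du/dt = -u - 0.5*u^2, u(0) = 1.
    return 1.0 / (1.5 * math.exp(t) - 0.5)

def model_rhs(u):
    # Misspecified model: the quadratic term is missing.
    return -u

def model_exact(t):
    # Exact solution of the misspecified model, u(0) = 1.
    return math.exp(-t)

T = 1.0
# Modeling error: exact model solution vs. truth. Independent of the step size.
modeling_error = abs(model_exact(T) - truth(T))

discretization_errors = []
total_errors = []
for n in (10, 100, 1000):
    numerical = euler(model_rhs, 1.0, T, n)
    # Verification question: are we solving the model equations right?
    discretization_errors.append(abs(numerical - model_exact(T)))
    # Validation question: are we solving the right equations?
    total_errors.append(abs(numerical - truth(T)))

print("modeling error:       ", modeling_error)
print("discretization errors:", discretization_errors)
print("total errors:         ", total_errors)
```

If only the total error is reported, a coarse grid can partly mask or exaggerate the model's deficiency, which is why the two need to be measured separately.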

>When the exact theory is known and readily computable, it's hard to see science/engineering value in "learned" surrogates that merely approximate the symmetries.

I completely agree, which is why the approach I am taking is to only utilize surrogates for things which are unknown or do not have an exact theory. I don't think surrogates will be more efficient than methods developed that exploit specific properties of the problem. In fact, I think the recent proof of convergence for PINNs simultaneously demonstrates this might be an issue (there was no upper bound to the proved convergence rate, but the one they could prove was low order).

>More generally, it is disheartening to see trends that would conflate discretization errors with modeling errors, lest it bring back the chaos of early turbulence modeling days that prompted this 1986 Editorial Policy Statement for the Journal of Fluids Engineering. https://jedbrown.org/files/RoacheGhiaWhite-JFEEditorialPolic....

Agreed, this is a difficult issue with approaches that augment numerical methods with data-driven components. There are ways to validate these trained components independent of the training data (i.e. by using other data), but validation will always be more difficult.

With enough coaxing, we can get the optimizer to converge to known methods (high-order, conservative, entropy-stable, ...), and I'm sure this tactic will lead to more papers, though they'll be kind of empty unless we're really discovering good methods that were not previously known.

I presume you meant "verify" in the last sentence.

No, what I am doing is using high order, conservative (universal DAEs), strong-stability preserving, etc. discretizations for the numerics but utilizing neural networks to represent unknown quantities to transform it into a functional inverse problem. In the discussion of the HJB equation, we mention that we solve the equation by writing down an SDE such that the solution to the functional inverse problem gives the PDE's solution, and then utilize adaptive, high order, implicit, etc. SDE integrators on the inverse problem. Essentially the idea is to utilize neural networks in conjunction with all of the classical tricks you can, making the neural network perform as small a job as possible. It does not need to learn good methods if you have already designed the training problem to utilize those kinds of discretizations: you just need a methodology to differentiate through your FEM, FVM, discrete Galerkin, implicit ODE solver, Gaussian quadrature, etc. algorithms to augment the full algorithm with neural networks, which is precisely what we are building.
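
A rough sketch of what "differentiating through the solver" can look like, in plain Python rather than the library discussed here. All of it is assumed for illustration: the toy ODE, the `Dual` class, and the function names are hypothetical, and the one-parameter closure `theta * u^2` is only a stand-in for a neural network. The known term (-u) is kept, a classical RK4 integrator does the numerics, and forward-mode automatic differentiation carries d/d(theta) through every solver step so the unknown term can be fit to data.

```python
import math

class Dual:
    """Minimal forward-mode dual number: a value and its derivative w.r.t. theta."""
    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

def rhs(u, theta):
    # Known physics (-u) plus an unknown term. theta*u^2 is a one-parameter
    # stand-in for a neural network (an assumption to keep the sketch short).
    return -1.0 * u + theta * u * u

def rk4_step(u, theta, dt):
    """One classical fourth-order Runge-Kutta step; works on Dual numbers."""
    k1 = rhs(u, theta)
    k2 = rhs(u + 0.5 * dt * k1, theta)
    k3 = rhs(u + 0.5 * dt * k2, theta)
    k4 = rhs(u + dt * k3, theta)
    return u + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def loss_and_grad(theta_val, data, dt=0.02, steps_per_obs=10):
    """Squared misfit at the observation times and its derivative in theta."""
    theta = Dual(theta_val, 1.0)  # seed: d(theta)/d(theta) = 1
    u = Dual(1.0)                 # initial condition u(0) = 1
    loss = Dual(0.0)
    for d in data:
        for _ in range(steps_per_obs):
            u = rk4_step(u, theta, dt)
        r = u - d
        loss = loss + r * r
    return loss.val, loss.der

def truth(t):
    # Synthetic "measurements": exact solution of du/dt = -u - 0.5*u^2, u(0) = 1.
    return 1.0 / (1.5 * math.exp(t) - 0.5)

data = [truth(0.2 * (j + 1)) for j in range(10)]  # observations at t = 0.2, ..., 2.0

theta = 0.0  # start from the known model alone
for _ in range(200):
    _, grad = loss_and_grad(theta, data)
    theta -= 1.0 * grad  # plain gradient descent, step size chosen by hand

final_loss, _ = loss_and_grad(theta, data)
print("recovered theta:", theta, "final loss:", final_loss)
```

The learned piece only has to account for what the known model misses; the order of accuracy still comes from the integrator. Forward mode is used here because there is a single parameter; with many parameters, reverse-mode or adjoint methods would be the usual choice.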

So I completely agree with you that throwing away classical knowledge won't go very far, which is why that's not what we're doing. We are utilizing neural networks within and on top of classical methods to try and solve problems where they have not traditionally performed well, or utilizing them to cover epistemic uncertainty from model misspecification.

This looks really interesting.

I think it would be a good topic for a blog post or teaching paper that shows how to do this for very simple problems "end-to-end" (e.g. the advection equation, diffusion equation, advection-diffusion equation, Burgers' equation, Poisson equation, etc.).

I see the appeal in showing that these can be used for very complex problems, but what I want to understand is what the trade-offs are for the most basic hyperbolic, parabolic, and elliptic one-dimensional problems. What's the accuracy? What's the order of convergence in practice? Are there tight upper bounds (and does that even matter)? What's the performance, and how does it scale with the number of degrees of freedom? What does a good training pipeline look like? What's the cost of training, inference, etc.?
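
For reference, this is the kind of baseline measurement meant by "order of convergence in practice", sketched for a classical method on the 1D Poisson problem. The test problem (-u'' = pi^2 sin(pi x) with homogeneous Dirichlet data), the grid sizes, and the function names are assumptions picked for illustration; the same refinement study could be run against any NN-based solver.

```python
import math

def poisson_max_error(n):
    """Solve -u'' = pi^2 sin(pi x) on (0, 1), u(0) = u(1) = 0, with n interior points.

    Uses second-order central differences and the Thomas algorithm, and returns
    the max-norm error against the exact solution u(x) = sin(pi x).
    """
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    # Tridiagonal system: -u[i-1] + 2*u[i] - u[i+1] = h^2 * f(x[i])
    rhs = [h * h * math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    diag = [2.0] * n
    # Forward elimination (both off-diagonals are -1).
    for i in range(1, n):
        m = -1.0 / diag[i - 1]
        diag[i] -= m * (-1.0)
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))

# n + 1 = 16, 32, 64, 128, so the mesh width h halves at each refinement.
errors = [poisson_max_error(n) for n in (15, 31, 63, 127)]
# Observed order between consecutive refinements: log2(e_h / e_{h/2}).
orders = [math.log2(errors[k] / errors[k + 1]) for k in range(len(errors) - 1)]

print("max errors:     ", errors)
print("observed orders:", orders)
```

The observed order should sit near 2 for this scheme. Reporting the same table of errors against degrees of freedom, together with wall-clock cost, is what would make a comparison with a trained solver meaningful.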

There are well-understood methods that are optimal for all of the problems above. Knowing that you can apply these NNs to problems without optimal methods is good, but I'd be more convinced that this is not just "NN-all-the-things hype" if I were to understand how these methods fare against problems for which optimal methods are indeed available.

No, it will not work well without the optimal method. But the method is no longer optimal if, say, a nonlinear term is added to these equations, so you can use the "optimal" method as a starting point and then try to nudge towards something better. Don't throw away any information that you have.

This comment sounds good. I was objecting to approaches like Eq 10 of your paper and much of the Karniadakis approach.