Hacker News
by metalens 2341 days ago
I used to do electromagnetic modeling using finite element methods (though I'm now a product manager for AI software infra), and it would take me anywhere from hours to days or weeks to model wave interaction with real-world objects.

A machine learning model trained to understand Maxwell's Equations can in principle be used to perform said simulations, probably resulting in an order or more of magnitude increase in simulation speed. Getting this to work well will reduce the time (and cost) it takes to design optical sensors, radar for autonomous vehicles, smartphone antennas, MRI machines, and more.

Having said that, it would require a lot of heavy lifting to pull this off and achieve near-physical accuracy for real-world physics problems.

A cursory Google search for "arxiv deep learning electromagnetics" returns proofs of concept in this direction.

2 comments

Where would the speedup come from? I don't understand.

If I understand your comment correctly, essentially you have a hand-crafted simulator for some physical process and then you train a neural net model to approximate the simulator. Why would the approximated simulator have "an order or more of magnitude increase in simulation speed"? Unless the approximation has massive losses in accuracy, of course.

Honestly asking and really interested to know what you mean.

It's all about precision heuristics, derived from joint probabilities of inputs and outputs. That, by and large, is how I am increasingly coming to understand the power of neural networks.

Imagine you are given a picture of a candle, overlaid with a grid, and asked to fill in, with colored pencils, colors for the air surrounding the candle representing relative temperature. Of course a human utilizes intuition to rapidly assign high temperature to the flame and decreasing temperature with increasing distance.

A "dumb" finite method would need, even for such a relatively simple problem (for a human), to perform calculations over a series of time steps in each grid cell until some steady-state condition is reached, arriving at a much more precise but still broadly similar coloring of the grid cells. You can do the same task much more quickly because you have developed an intuition for the physics, which is to say you have learned heuristics which capture the general trends of the problem (air is hot close to a flame and cold far away).
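
For concreteness, the brute-force side of the candle example can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not anyone's actual solver: it assumes a square grid, one "flame" cell pinned at a hot value, edge cells pinned at zero, and plain Jacobi relaxation standing in for the time stepping. The function name, grid size and tolerance are made up for illustration.

```python
def relax_to_steady_state(n=21, hot=1.0, tol=1e-4, max_sweeps=100_000):
    """Jacobi relaxation of a steady-state heat problem on an n x n grid.

    Assumptions (illustrative only): the centre cell is the flame and is
    held at `hot`; the edge cells are held at 0.0 (cold air far away).
    Returns (grid, sweeps_used).
    """
    c = n // 2
    grid = [[0.0] * n for _ in range(n)]
    grid[c][c] = hot
    for sweep in range(1, max_sweeps + 1):
        new = [row[:] for row in grid]
        biggest_change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if i == c and j == c:
                    continue  # the flame cell stays pinned
                # Each interior cell becomes the average of its four neighbours.
                new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                    + grid[i][j - 1] + grid[i][j + 1])
                biggest_change = max(biggest_change, abs(new[i][j] - grid[i][j]))
        grid = new
        if biggest_change < tol:
            return grid, sweep  # nothing moves any more: steady state
    return grid, max_sweeps


grid, sweeps = relax_to_steady_state()
centre = len(grid) // 2
# Temperatures along a line running away from the flame.
profile = [grid[centre][centre + d] for d in range(0, 10, 3)]
```

The point of the sketch is the cost structure: every cell is revisited on every sweep until the grid stops changing, just to arrive at "hot near the flame, cooler further out", which `profile` shows.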

Neural nets take the best of both worlds - by effectively learning probability relationships between input and output pixels, they internalize heuristic approaches to produce outputs approaching finite method accuracies at a fraction of the computation. There's a lot of waste that can be optimized out of finite computation by hardcoding rules (heuristics), but doing so for real problems is impractical. Neural nets learn these rules through training - a far simpler task is organizing the data to teach the net the right trends; much like designing lessons for a child to teach a predictive ability.
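
A toy version of the "learn it from solved examples" idea, as a hedged sketch only. It assumes a 1D rod with fixed end temperatures as the stand-in simulator, and a two-weight least-squares fit as the stand-in for a neural net. This toy problem happens to be linear, so two weights are enough and the fit is essentially exact; a real electromagnetic surrogate would need a nonlinear model and would not be exact. All names here are hypothetical.

```python
def solve_rod(left, right, n=21, tol=1e-6):
    """Stand-in 'simulator': Jacobi sweeps on a rod with fixed end temperatures.

    Returns (temperatures, cell_updates) so the cost can be compared.
    """
    t = [0.0] * n
    t[0], t[-1] = left, right
    updates = 0
    while True:
        new = t[:]
        change = 0.0
        for i in range(1, n - 1):
            new[i] = 0.5 * (t[i - 1] + t[i + 1])
            change = max(change, abs(new[i] - t[i]))
            updates += 1
        t = new
        if change < tol:
            return t, updates


def fit_surrogate(cases, probe):
    """Least-squares fit of T[probe] ~ w_l * left + w_r * right.

    The training targets come from running the slow solver on each case.
    """
    s_ll = s_lr = s_rr = s_ly = s_ry = 0.0
    for left, right in cases:
        y = solve_rod(left, right)[0][probe]
        s_ll += left * left
        s_lr += left * right
        s_rr += right * right
        s_ly += left * y
        s_ry += right * y
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s_ll * s_rr - s_lr * s_lr
    w_l = (s_ly * s_rr - s_lr * s_ry) / det
    w_r = (s_ll * s_ry - s_lr * s_ly) / det
    return w_l, w_r


# "Training": a handful of pre-solved cases on a coarse grid of inputs.
training_cases = [(l, r) for l in (0.0, 50.0, 100.0) for r in (0.0, 50.0, 100.0)]
w_l, w_r = fit_surrogate(training_cases, probe=5)

# "Inference": an unseen case, answered with two multiplications.
reference, updates = solve_rod(37.0, 82.0)
prediction = w_l * 37.0 + w_r * 82.0
```

Here `updates` counts the cell updates the solver needed for the unseen case, while the fitted model answers with two multiplications and an add. The training cost is paid once up front, which is where any speedup at query time would come from. Whether accuracy survives on a realistic nonlinear problem is exactly the open question in this thread.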

I'm skeptical of the claim that it's easier to train a neural net than to hand-code a set of heuristics _when the heuristics are already known_. For the time being, optimal results with neural nets need more data and more computing power ("more" because it's never enough) and are primarily useful when a hand-coded solution is not possible.

I also don't understand how it is possible for a neural net (or any approximator, really) to approximate a "precision heuristic" faster than a hand-coded heuristic and without a gross loss of, well, precision, on the order that would make the results unusable for engineering or scientific tasks. Could you elaborate?

I’m also skeptical, but after reading the explanation above, I am intrigued.

Say I have a cube with 100 x 100 x 100 mesh cells inside, and ports on opposing faces. Given enough time, I can literally run through every possible combination of PEC and air for every cell and solve the FD form of Maxwell's equations, then save the results. Now, a user can ask my solver for any of those cases, and I simply pull the presolved result and give the user the answer with an orders-of-magnitude reduction in time.

Obviously, the presolving approach doesn’t scale. More materials, more mesh cells, eventually it is impractical to presolve every case. But the beauty of neural networks is that they can be very good at generalizing from a partial sample of the problem space. In effect, they can give results close enough to the presolve solution with drastically reduced numbers of computations.
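
To put rough numbers on the presolve idea, here is a small sketch. It assumes two materials per cell (PEC or air) as in the cube example; `toy_solve` is a made-up placeholder for a real field solve, not an actual solver.

```python
import math
from itertools import product


def digits_in_case_count(cells, materials=2):
    """Number of decimal digits in materials**cells, without building the integer."""
    return math.floor(cells * math.log10(materials)) + 1


def toy_solve(config):
    """Hypothetical placeholder for an expensive field solve."""
    return sum(config) / len(config)  # e.g. fraction of cells that are PEC


# A 2 x 2 x 2 cube (8 cells) is small enough to presolve exhaustively.
presolved = {cfg: toy_solve(cfg) for cfg in product((0, 1), repeat=8)}
answer = presolved[(1, 0, 0, 1, 0, 0, 0, 1)]  # a lookup, no solve needed

# The 100 x 100 x 100 cube from the example above is another matter.
digits = digits_in_case_count(100 ** 3)
```

The 8-cell table has 256 entries. For the million-cell cube the number of cases is 2 to the power 1,000,000, a number with 301,030 decimal digits, so only a vanishingly small sample of it could ever be presolved.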

>> But the beauty of neural networks is that they can be very good at generalizing from a partial sample of the problem space.

That is really not the case. Neural nets generalise very poorly, hence the need for ever larger amounts of data: to overcome their lack of generalisation by attempting to cover as many "cases" as possible.

Edit: when this subject comes up I cite the following article, by François Chollet, maintainer of Keras:

The limitations of deep learning

https://blog.keras.io/the-limitations-of-deep-learning.html

I quote from the article:

This stands in sharp contrast with what deep nets do, which I would call "local generalization": the mapping from inputs to outputs performed by deep nets quickly stops making sense if new inputs differ even slightly from what they saw at training time. Consider, for instance, the problem of learning the appropriate launch parameters to get a rocket to land on the moon. If you were to use a deep net for this task, whether training using supervised learning or reinforcement learning, you would need to feed it with thousands or even millions of launch trials, i.e. you would need to expose it to a dense sampling of the input space, in order to learn a reliable mapping from input space to output space.

Well...I think that take is a little overly cynical, and I disagree particularly with this:

>the mapping from inputs to outputs performed by deep nets quickly stops making sense if new inputs differ even slightly from what they saw at training time

In my experience that isn't really true, if you have an appropriately designed net, training data which appropriately samples the problem space, and the net is not overtrained (overfit).

You can think of training data as representing points in a high-dimensional space. Like any interpolation problem, if you sample the space with the right density, you can get accurate interpolation results - and neural nets have another huge advantage, in that they learn highly nonlinear interpolation in these high-dimensional spaces. So the net may be unlikely to generalize to points outside of the sampled space - although now that I think of it I'm not sure how nets handle extrapolation - but when you're dealing with a space with thousands of dimensions (like each pixel in an image) you can still derive a ton of utility from the interpolation, which effectively replaces hardcoded rules about the problem you're solving.
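
A one-dimensional illustration of the sampling-density point, as a sketch only. It uses a piecewise-linear interpolator of sin(x) as a stand-in for a trained model; it is not a neural net and says nothing about how any particular network extrapolates. The function names and sample counts are made up.

```python
import math


def make_interpolator(xs, ys):
    """Piecewise-linear model through the samples.

    Outside the sampled range it simply extends the nearest segment.
    """
    def predict(x):
        i = 0
        while i < len(xs) - 2 and x > xs[i + 1]:
            i += 1
        slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        return ys[i] + slope * (x - xs[i])
    return predict


def sampled_model(n):
    """'Train' on n evenly spaced samples of sin(x) over [0, pi]."""
    xs = [math.pi * k / (n - 1) for k in range(n)]
    return make_interpolator(xs, [math.sin(x) for x in xs])


def max_error(predict, lo, hi, points=1000):
    """Largest absolute error against sin(x) on an evenly spaced test grid."""
    worst = 0.0
    for k in range(points + 1):
        x = lo + (hi - lo) * k / points
        worst = max(worst, abs(predict(x) - math.sin(x)))
    return worst


sparse, dense = sampled_model(5), sampled_model(50)
inside_sparse = max_error(sparse, 0.0, math.pi)          # within the sampled range
inside_dense = max_error(dense, 0.0, math.pi)            # same range, denser samples
outside_dense = max_error(dense, math.pi, 2 * math.pi)   # beyond the sampled range
```

Denser sampling shrinks the error inside the sampled range (`inside_dense` is far smaller than `inside_sparse`), while `outside_dense` stays large no matter how dense the samples were. That matches the distinction being drawn here between interpolation and extrapolation.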

It doesn’t need to generalize, just do sophisticated interpolation.

Basing the results on a dense sampling of the input space is exactly what I was suggesting.

Well, there are pretty convincing examples in other domains: try hardcoding rules to classify animals or objects in photos, especially an algorithm which can handle thousands of different categories. Totally impractical - but if you appropriately design the net and structure the training data, you can train a pretty accurate net on a mid-range GPU in a matter of hours to do what would take far, far longer to hardcode!

Perhaps not quite appropriate to call them heuristics in this context, but the principle is the same - you are leveraging joint probabilities of pixels to generate some conditional output. Similar principle in ML accelerated modeling.

I think I understand what you meant by heuristics. I agree that it's impractical to try and hand-code image recognition rules and all attempts to do that in the past have failed as they have in similarly complex domains (like machine translation, say). My concern is particularly about the use of neural networks (or in general machine learning models that learn to approximate a function) in domains where precision is normally required, like engineering. I mean, I know there's plenty of approximation in engineering already but of course we're not talking about computing integrals here (er, I think?).

Anyway I was especially trying to understand the OP's comment about speedup using a neural network. I'm still a bit confused about that. But thanks for the conversation.

You're on the right track. A lot of this tech is a potential goldmine and I'm sure there are many players developing in secret and not publishing yet (or ever).