Technical milestone reached: global earth system simulations with 1.2 km resoln (mpimet.mpg.de)
162 points by jokabrink 1352 days ago
14 comments

"I don't know, Timmy, being God is a big responsibility" https://qntm.org/responsibility , Topic relevant short scifi on Earth simulation.
Also relevant and mind-expanding essay:

Simulation, Consciousness, Existence

Hans Moravec, 1998.

https://frc.ri.cmu.edu/~hpm/project.archive/general.articles...

Thank you for sharing that essay. I found it fascinating.
Devs is a great mini-series that has similar themes.
Also IIRC, Devs is loosely based on the above linked short story.
So great. “Uh-oh” for me was one of those Lovecraftian moments of unraveling.
I'm curious about the implementation language. Viewing the code is not easy.

If you guess Fortran, you might be right:

(different ICON Project) "The infrastructure, ICON-Land, for this ICON-A land component has been newly designed in a Fortran2008 object-oriented, modular, and flexible way."

https://mpimet.mpg.de/fileadmin/publikationen/Reports/WEB_Bz...

Fortran alive and kicking:

https://developer.nvidia.com/cuda-fortran

Most of what they do in the climate simulation arena uses Fortran, and this case is no exception. The reason is that other languages would be less efficient, and thus more climate-endangering emissions would be produced. Plotting and data wrangling are another story.
They don't use FORTRAN because of efficiency or emissions reductions lol, they use FORTRAN because they always did and the sort of scientists who do modelling rarely see any reason to upgrade their skills or tools.

Programming is something scientists tend to study only as far as needed to get results that look right. This is how the most influential COVID model ended up being a 15,000 line student-quality C program with hundreds of single-letter name global variables.

They definitely stick to it, and it might be that they build convenient narratives about it. Mind you, the energy use of climate simulations is not trivial. Or do you think such a machine can be fed with a typical electrical installation from a regular office?
Fortran is no longer in ALL CAPS.
When will people learn it..
I had hoped for Julia.
If the margins are so tight that 1 day can only produce 2.5 days of simulation, why lose even a small margin with Julia?
How do you know they would lose a significant margin with Julia?
There were talks about it in the climate modelling arena, and a slowdown of around 10% was mentioned in some discussions.
Well, if that's not hard data!
The article talks about 1.2km horizontal resolution. Is it a 3d grid? If so what is the vertical resolution? Or is the vertical dimension integrated within 2d boxes?
Yes, these are fully 3D simulations. The vertical resolution is slightly harder to describe because models like these typically use more complicated vertical coordinates than just "meters above ground". E.g. a technical article on the ICON-A formulation [1] says that it uses a "hybrid sigma" coordinate - so something that follows a combination of pressure and terrain. For a very high-resolution simulation that resolves clouds, you'd need to crank this up to O(100 m) in the bottom of the model, but it can space out as you get higher in the atmosphere.

[1]: https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2017MS00...

How does such a simulation work?

E.g. how do you predict the temperature at x,y? Is it ground type, water, sand? Altitude? Neighboring values thereof?

What are the inputs? Do you give it a starting point and apply it to a bunch of elements like some giant automata like game of life?

Some kind of finite element analysis thing?

So many questions.

Here is a very brief and basic introduction to numerical simulation:

Simulation of dynamic systems is a big, deep area. In general you use what is called numerical simulation, where you have a model describing your system in the form of partial differential equations.

You start with the chosen initial conditions, choose a delta-t as your time increment, and solve the equation for those inputs. That result is the input to the next iteration.

The most basic algorithm to solve such an equation is “Euler’s method”, but no one actually uses that; they use much more advanced methods. But if you are learning, that is where you start.
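
The stepping loop described above (initial condition, a fixed delta-t, each result feeding the next iteration) can be sketched with the forward Euler scheme. This is only a toy under stated assumptions: the model is a single ordinary differential equation (Newtonian cooling) rather than a PDE, and the names `euler_integrate`, `cooling`, `K` and `T_ENV` are made up for illustration, not taken from any real climate code.

```python
import math

def euler_integrate(rhs, state, dt, n_steps):
    """Advance `state` by n_steps forward-Euler steps of size dt."""
    t = 0.0
    for _ in range(n_steps):
        # The result of this step is the input to the next iteration.
        state = state + dt * rhs(t, state)
        t += dt
    return state

# Hypothetical toy model: a body cooling toward the ambient temperature.
K = 0.5       # cooling rate per hour (assumed value)
T_ENV = 10.0  # ambient temperature in degrees C (assumed value)

def cooling(t, temp):
    """Right-hand side dT/dt = -K * (T - T_ENV)."""
    return -K * (temp - T_ENV)

# Start at 30 C and integrate 2 hours with dt = 0.01 hours.
final = euler_integrate(cooling, 30.0, dt=0.01, n_steps=200)

# Closed-form solution of the same equation, for comparison.
exact = T_ENV + (30.0 - T_ENV) * math.exp(-K * 2.0)
print(final, exact)
```

Shrinking `dt` brings the two numbers closer together at the price of more steps, which is the basic cost/accuracy trade-off the more advanced methods try to improve on.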

This approach has advanced greatly over the last 70 years. Doing numerical simulation is why early computing work got funding: to simulate nuclear reactions inside bombs.

Now numerical simulation is the occupation of all the world's top supercomputers. It’s used for climate simulation, bridge strength, how skyscrapers flex in the wind, testing car crashes or even simulating the strength of ceramics. Oh, and it is used a lot in financial simulations to model risk and calculate the price of assets.

Right now I'm playing with simulated weather systems using an automaton for each grid location at an effective resolution of a square km. I'm getting predictive real-world accuracy within around 10 degrees C with a range of 2 days. Very rough. It takes a long time to simulate a globe, which I've found is really important to do. A limited region is usually not as useful.

It's an interesting field. But it seems not so easy to get the real methods used by the bigger models.

(I'm not using a supercomputer...)

Such simulations usually consist of three major systems (atmosphere, land, and ocean) that are coupled together at their geometric boundaries by a coupler that 'communicates' values like temperature from one domain to another. The coupler is needed because of different grid geometries, time step size differences and other aspects.

You initialize the system at some known state (i.e. set the temperature, pressure, etc. at all grid points to real world measurements) and then integrate a complex differential equation for the next time step and so forth. So it is not like an automaton. Finite element analysis comes closer, but I think they use a different scheme like finite volume methods.

A lot of insight can be gained from [this](https://pure.mpg.de/rest/items/item_3379183/component/file_3...) paper. The first 10 pages should give you a rough overview.

> Do you give it a starting point and apply it to a bunch of elements like some giant automata like game of life?

Roughly speaking, yes. Divide everything into a grid of cells. Model a cell's state with a bunch of numbers, and apply some rules to update the cell state from its neighbors. The trick is to figure out the update rules. One needs to write the differential equations first, incorporating all relevant physical processes into them, and then transform the equations into those update rules, which will be way more complex than with Game of Life.

Though it may be even more complex, like different time steps at different points in time, or refining the grid of cells to increase detail in areas where much is going on, at the cost of slowing down the simulation. Most of the complications are due to the limited abilities of our computers: the idea is to get more precision by calculating less.
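
To make the "cells updated from their neighbors" picture concrete, here is a minimal sketch of what a differential equation turned into an update rule can look like. It assumes the simplest possible case, the 2D heat (diffusion) equation discretized with an explicit finite-difference stencil on a small periodic grid. The function name `diffuse_step` and the value of `alpha` are illustrative choices, and real Earth system models use far more elaborate schemes.

```python
def diffuse_step(grid, alpha):
    """One explicit update of the 2D diffusion equation on a periodic grid.

    Each cell changes by alpha times the discrete Laplacian, i.e. the sum of
    its four neighbors minus four times its own value.
    alpha must be <= 0.25 for this explicit scheme to stay stable.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    new = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            up = grid[(i - 1) % n_rows][j]
            down = grid[(i + 1) % n_rows][j]
            left = grid[i][(j - 1) % n_cols]
            right = grid[i][(j + 1) % n_cols]
            laplacian = up + down + left + right - 4.0 * grid[i][j]
            new[i][j] = grid[i][j] + alpha * laplacian
    return new

# A single hot cell on an otherwise cold 8x8 grid.
grid = [[0.0] * 8 for _ in range(8)]
grid[4][4] = 100.0

for _ in range(50):
    grid = diffuse_step(grid, alpha=0.2)

total = sum(sum(row) for row in grid)   # conserved on a periodic grid
peak = max(max(row) for row in grid)    # the hot spot spreads out
print(total, peak)
```

Unlike Game of Life, the cell state here is a real number and the rule is derived from a physical equation, but the loop structure is the same.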

giant partial differential equations
They are simpler than you think.
Try proving that their solution is unique.
Simple does not mean trivial.
If only there was a way to find out...
I know, right? But together we can still dream.
This is fantastic - would be really exciting to see this level of resolution making its way to global operational weather forecasts (currently at ~10km).
It's far too expensive with dubious impacts on forecast quality. Adaptive mesh approaches are far more suitable for high res global weather modeling... Why simulate the area under boring, dynamically unimportant areas?
> Why simulate the area under boring, dynamically unimportant areas?

Butterflies... terrifyingly large, kilometer-scale butterflies.

Which is *exactly* why ultra-high resolution global weather simulation has dubious prospects for improving forecasts. When you're at spatial scales where you need to parameterize convection, there's an inherent "smoothness" to model solutions that suppresses noisy errors. If you go to cloud-resolving scales - which is needed for simulations like the ones here - you don't get the benefit of that smoothness anymore, because you need to actually resolve scales of motion that are incredibly fine. It's a losing proposition; you'll never get it "perfect", so you're much more likely to spin up an error cascade with significant impacts on forecast down the line, through things like the structure of organized convection.

But dynamically uninteresting, quasi-balanced setups and modes? There's far less to worry about in terms of the butterfly effect, and any errors you might worry about will be dwarfed by the fact that we don't have good data to assimilate in places like the remote oceans anyways.

It's also worth pointing out that the mathematics of error / perturbation growth in the atmosphere is well understood. In fact, this fundamentally underpins how we've developed data assimilation approaches over the past two or three decades that allow us to effectively leverage new datasets such as satellite data to increase forecast quality and reliability at longer lead times. So it's somewhat trivial to actually directly quantify these "butterflies."

If we're ever going to get to the femtometer resolution required for very precise 100 day weather forecasting, we have to start somewhere, so let them waste their time. It's not as though this is part of a growing trend to abandon conventional weather and climate modeling.
Why do you think that we need "femtometer resolution" for "very precise" 100 day weather forecasting? What even is "very precise" 100 day weather forecasting? I think it's very amusing to do the math on how much memory would be required to run a crude primitive equation dycore over even the tiniest of domains at femtometer resolution :)

> It's not as though this is part of a growing trend to abandon conventional weather and climate modeling.

The thing is, there *absolutely is* a trend towards private investment in weather modeling going towards faux-moonshot ideas like cubesat constellations without demonstrated ROI and that would require evolutionary leaps forward in data assimilation, or for deep learning to replace weather models. A miniature version of this already played out with precipitation nowcasting - probably the easiest weather forecasting problem that you could approach with an AI system, yet the approaches that have been developed so far barely improve over optical flow or other simple approaches, let alone advance our capability to forecast, say, convective initiation.

The future of weather forecasting is larger ensembles (O(100-500) ensemble members, across 2-5 different models) of near-convective-resolving global models at meso-gamma (2-10 km resolution) fed into slightly more sophisticated statistical post-processing systems - almost certainly trained using simple AI/ML techniques on large-scale reforecasts of these parent model systems, or brute-forcing purely Bayesian statistical approaches.

You want to know the precise shape of the Earth's surface in femtometer precision?

There are some profound problems with that idea once you get below 10 meter or so, but I'll let you think that one through yourself.

there's a very good physical argument that this is impossible. if you want to store 1 bit per femtometer simulated, at current computer sizes, we are talking about a computer billions of times the size of the earth. even if you use 1 atom per bit, your computer will be almost as big as the earth. such a computer will collapse under its own gravity.
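
For anyone who wants to redo the back-of-envelope math, here is a rough sketch. All constants are approximate assumptions: Earth's surface area of about 5.1e14 m^2, a commonly quoted order-of-magnitude estimate of about 1.3e50 atoms in the Earth, and an arbitrary 10 km deep atmospheric layer for the 3D case. The comment above does not say whether it counts a 2D surface or a 3D volume, so both are shown.

```python
EARTH_SURFACE_M2 = 5.1e14   # approximate surface area of the Earth
ATOMS_IN_EARTH = 1.3e50     # rough, commonly quoted estimate
FEMTOMETER_M = 1e-15
LAYER_DEPTH_M = 1.0e4       # assumed 10 km deep layer for the 3D case

def surface_cells(spacing_m):
    """Number of square cells of the given spacing covering the surface."""
    return EARTH_SURFACE_M2 / spacing_m ** 2

def volume_cells(spacing_m):
    """Number of cubic cells of the given spacing filling the assumed layer."""
    return EARTH_SURFACE_M2 * LAYER_DEPTH_M / spacing_m ** 3

cells_2d_fm = surface_cells(FEMTOMETER_M)   # about 5e44 cells
cells_3d_fm = volume_cells(FEMTOMETER_M)    # about 5e63 cells
cells_2d_km = surface_cells(1.2e3)          # about 3.5e8 columns at 1.2 km

# Storing one bit per cell with one atom per bit, as a fraction of Earth's atoms.
print(cells_2d_fm / ATOMS_IN_EARTH)
print(cells_3d_fm / ATOMS_IN_EARTH)
print(cells_2d_km)
```

Under these assumptions a single 2D layer needs a small fraction of the Earth's atoms, while a 3D grid needs many orders of magnitude more atoms than the Earth contains.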
It's extremely unlikely that we'll ever get anywhere near that. Even meter precision is impractical.
And the butterflies are full of hate.

https://www.schlockmercenary.com/2017-07-13

As I have reluctantly learned today.
At least three industries will be grateful for high-res wind predictions: aerospace, maritime and wind-generation.
Is this for short-term weather prediction or long-term climate modelling? The 2.5 simulated days per day points in the short-term direction.
As mentioned in the article, the system is based on the ICON Earth System model[1], which has the following description:

The Earth system model provides a numerical laboratory for research on the climate dynamics on time scales of a season to millennia. Necessarily most processes are parameterized to allow the computationally efficient integration over long periods.

It's also mentioned it will contribute to DestinE[2]:

Destination Earth (DestinE) aims to develop – on a global scale - a highly accurate digital model of the Earth to monitor and predict the interaction between natural phenomena and human activities. [...] The initial focus will be on the effects of climate change and extreme weather events, their socio-economic impact and possible adaptation and mitigation strategies.

[1]: https://mpimet.mpg.de/en/science/models/icon-esm/

[2]: https://digital-strategy.ec.europa.eu/en/policies/destinatio...

I am working on DestinE with workflow managers and posted my first entry in Who's Hiring some days ago. Lots of work and interesting challenges if anyone is interested in earth system models, NWP, workflows, HPC, GPUs, data formats, etc. Not only at my company, the BSC in Spain, but also in other countries/companies as well.
I'm not sure the added precision is helpful if it's just 2.5 days ahead. An extra week of accurate forecasts is a bit pointless if it takes 8 days.
Simulating 8 days would take 8/2.5 = 3.2 days.
When you’re trying to assess imminent danger from an approaching tropical cyclone, days and hours count. A lot.
I wouldn't think that a temporal resolution measured in days is needed or all that helpful for climate modeling.
The real key of this story is in the "simulating — rather than parameterising".
This sounds a lot less than Nvidia's FourCastNet?

https://resources.nvidia.com/en-us-fleet-command/watch-27?xs...

How do these two compare?

They're completely different classes of models.

The modeling system in the linked article is a high-fidelity numerical simulation of the coupled Earth system. It's a giant PDE solver for Navier-Stokes applied to the Earth's atmosphere and oceans, coupled together with a great deal of additional physics simulation. The intent is to reproduce, in simulation, the Earth's atmosphere and oceans with the highest fidelity. This set of simulations is the culmination of nearly 70 years of investment, going back to the very first applications of digital computers for solving complex math equations (one of the first simulations built for ENIAC was a crude quasi-geostrophic atmospheric model / weather forecast).

NVIDIA's FourCastNet, while very cool, is quite literally a facsimile of this type of system. It's really not even in the same ballpark.

You realize "facsimile" means "copy", right? You didn't explain how they are different.
It's a deep learning based emulator of a full-complexity NWP system, but it is far from a production-ready or operational technique.
It’s an example of a surrogate model. It’s an ML model trained on the output of large numerical simulations like the OP, rather than doing the simulation itself.

Surrogate models are nice because they can emulate the output of the full fidelity calculation in a fraction of the runtime, but they typically are trained within a range of validity outside of which they cannot reliably extrapolate.
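
A toy illustration of that "range of validity" point, with everything in it assumed for the sake of the example: `expensive_model` is a hypothetical stand-in for a costly simulation (here just x squared), and the surrogate is a plain least-squares line rather than a deep network. It has nothing to do with how FourCastNet itself is built.

```python
def expensive_model(x):
    """Hypothetical stand-in for a costly full-fidelity simulation."""
    return x * x

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x

# "Train" the surrogate on model output sampled only from the range [0, 1].
train_x = [i / 10 for i in range(11)]
train_y = [expensive_model(x) for x in train_x]
slope, intercept = fit_line(train_x, train_y)

def surrogate(x):
    """Cheap emulator: only trustworthy near the training range."""
    return slope * x + intercept

in_range_error = abs(surrogate(0.5) - expensive_model(0.5))
out_of_range_error = abs(surrogate(3.0) - expensive_model(3.0))
print(in_range_error, out_of_range_error)
```

Inside the training range the cheap fit stays close to the expensive function. At x = 3, far outside it, the error is much larger.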

I have only one question. Does this mean that we are getting closer to a cloud-hosted ultimate map for games like Call of Duty and Ace Combat?

I want to dogfight over Ohio, land at Offutt to play Warzone in Omaha, then take a MRAP and drive to NY.

1:1 scale multiplayer world maps have existed for decades, e.g. in Microsoft Flight Simulator.

The difficulty of making a "large" map comes from what you want to simulate and in how much detail, not from how big it is per se.

This pretty much.

ED has an approximation of the entire Milky Way in an MMORPG, where you can visit individual planets of around 400 billion star systems. These are obviously "generated" except for maybe a few handcrafted systems like Sol.

The problem really isn't "size".

And sometimes you don't need to simulate. MSFS pulls in live METAR data, though previously they used Meteoblue forecast data.
Since I can't reply to the RoP thread, you said, "Is this about the brown-skinned male Elf and the brown-skinned and beardless female Dwarf? This horse has been beaten to death and back on YouTube and Reddit, and the consensus is that it's fine and faithful to the texts." This is absolutely ASININE in how inaccurate it is. The show is BARELY related to the works it is allegedly based on. And brown people aren't the problem, since Haradrim/Easterlings exist in Tolkien's Legendarium. Where is Celeborn? Why is Isildur around 1500 years before he was born? Why is Durin III Durin IV's father, when the dwarves only allow one person at a time to be named Durin due to their belief that each Durin is a reincarnation of the previous one? Why is Gil-galad able to pardon Galadriel (for killing orcs) when it was the Valar who banned her from Valinor, and she isn't pardoned until three thousand plus years later when she rejects the ring? Why is her motivation to get revenge when in the text her motive is to create her own kingdom to rule (kind of like Satan in Christian theology, which is why she's interesting, since her primary conflict is her own pride versus her own wisdom)? Speaking of wisdom, why is she a petulant, hot-headed teenager when she's thousands of years old? Why is she going around hunting orcs when what she was doing on Middle-earth in the Second Age was ruling various places and being immersed in Elvish politics (and of particular note, her issues with Celebrimbor and Annatar)? Why is she going around swinging her sword like an anime protagonist? She might be tall and athletic, but there is scarcely anything written about her in battle, save for her bringing down the walls of Dol Guldur (which typically is done with magic, or a wrecking crew, not an anime sword). Why is she attempting to swim from Valinor to Middle-earth? Why is she in Númenor when she literally never went there?

There's a LOT more I could ask about her, and that's just one character. This show is fan fiction LOOSELY based on the writings. VERY loosely.

That is all. I'll enjoy my ban now.

> throughput of 2.5 simulated days per day

Will need to be at least 10x faster to be operationally useful.
Is fixed resolution really the way to go? I'd imagine big patches of the ocean and deserts being modeled as single nodes would be way more efficient without compromising fidelity.
Although final output in those regions is perhaps less relevant, their state would still have an impact on other areas as the simulation progresses.
Why did they target 1.2km instead of 1? Or any other number?
I guess best ask the authors.

> Our ICON-ESM configuration is already used in production mode for scientific purpose with horizontal resolutions of 10 km, 5 km and 2.5 km. With the 1.2 km configuration we have now opened the door for a new class of numerical models which will allow us to investigate local impacts of climate change, such as extremes of precipitation, storms and droughts.

There is some evidence of them using 10 km cells and then repeatedly subdividing into halves, which gets you down to 1.25 km.
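
The halving arithmetic is quick to check. This sketch assumes, as the comment above does, a 10 km starting spacing and exact halving at each refinement level; the function name `refine` is made up, and whether the actual grid generator works this way is not confirmed by the article.

```python
def refine(spacing_km, levels):
    """Return the grid spacings obtained by halving `levels` times."""
    return [spacing_km / 2 ** k for k in range(levels + 1)]

spacings = refine(10.0, 3)
print(spacings)  # [10.0, 5.0, 2.5, 1.25]
```

The first three values match the 10 km, 5 km and 2.5 km configurations quoted above, and 1.25 km would plausibly be reported as "1.2 km".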

The grid is not square, it is hexagonal; surely it relates to that.
How is this going to change seasonal-scale forecasts?
1.25 ~ 1.3: Round half to odd prefers preserving the existing scale of tie numbers avoiding out-of-range results when possible for numeral systems of even radix.