by briffle 1478 days ago
What blows my mind is the newest NOAA super computer (that triples the speed of the last one) is a whopping 12 petaflops. It comes online this summer.

It kind of shows the difference in priority spending, when nuclear labs get >1000 petaflop super computers, and the weather service (that helps with disasters that affect many Americans each year) gets a new one that is 1.2% of the speed.

https://www.noaa.gov/media-release/us-to-triple-operational-....
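The 1.2% figure is just the ratio of the two headline numbers. A quick check, taking the 12 and 1000 petaflop figures from the comment at face value (the lab machine is only given as ">1000", so the share is an upper bound, and the "previous system" number is merely what "triples the speed" implies):

```python
# Figures as quoted in the comment above; both are rounded headline numbers.
NOAA_PFLOPS = 12.0    # new NOAA system
LAB_PFLOPS = 1000.0   # lower bound for the lab machine (">1000")


def share_percent(small: float, large: float) -> float:
    """Return `small` as a percentage of `large`."""
    return 100.0 * small / large


noaa_share = share_percent(NOAA_PFLOPS, LAB_PFLOPS)
previous_noaa = NOAA_PFLOPS / 3.0  # "triples the speed of the last one"

print(f"NOAA share of lab machine: at most {noaa_share:.1f}%")
print(f"Implied previous NOAA system: about {previous_noaa:.0f} petaflops")
```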

8 comments

The national labs aren't purely--or likely even mostly--dedicated to nuclear research. Instead, they cover a lot of the basic science research. These supercomputers will likely be used for projects like exploring cosmological models, or studying intramolecular interactions for chemical compounds, or fine-tuning predictions about properties of the top quark, etc.
Because these are so linked to research, everyone and their cousin is vying for time on them. Even though it may be massive, no individual will get anywhere near peak.
Usually on the DOE machines some time is reserved for 'hero runs', but it's generally only the classified stockpile work that qualifies.
Oak Ridge in particular is in DoE Office of Science. They do some national security work, but their primary focus is basic science. Some of the national labs do primarily nuclear-weapons-related research, but not Oak Ridge. Frontier is only doing unclassified work, primarily basic science and engineering.
Right; the labs most associated with nuclear work are LLNL and LANL. Both have had, IIRC, clusters configured for dual work: they could be partitioned between confidential and public work. The lab I worked at, LBL, only did non-conf work but I know that LLNL took our codes and used them for nuclear simulations... errr, stockpile stewardship using multiphysics combustion codes.
Oak Ridge does not do dual-use on the big leadership facilities, it's not really feasible with how they operate. I think Frontier can technically handle "moderate" data (e.g. export controlled), but not classified. It's meant for open science.
Didn't LANL get renamed to LANS?
No, LANS was the name of the LLC that used to run LANL (it lost the contract a few years ago, hence the past tense), but the name of the lab itself is [still] LANL. The way the labs are run, only a few people in the top leadership roles change if the management company changes; grossly oversimplifying, the management companies are basically interchangeable and come and go while the technical side of the labs remains stable.
I didn't know that, thank you for clarifying for me!
Would a faster computer improve outcomes for victims of natural disaster? How much is left undiscovered about weather?

Research spending is based on the potential for discovery. As a species we have studied weather since the beginning of time. How long have we been doing nuclear research? A century?

Is there even an opportunity cost here? Or is it an economy of scale? As we build more supercomputers the costs go down. So NOAA and ORNL both get what they need for less.

> Would a faster computer improve outcomes for victims of natural disaster? How much is left undiscovered about weather?

The US is way behind on weather modelling, in part due to lack of computing power available to do the grids at sufficiently small cells compared to Europe and other parts of the world. That means less accurate predictions and less advance notice of impending disasters, which means more risk of loss of life and impact on infrastructure and the economy (and vice versa, inaccuracy can lead to more caution than is necessary, which has economic impact too). The US has to lean on Europe etc. for predictions.

https://cliffmass.blogspot.com/2020/02/smartphone-weather-ap...

Talks about the fact that IBM / Weather.com actually uses a more accurate system than the NWS uses, because the NWS is still stuck on GFS (it's been several years now since Congress passed an act to force NOAA to update away from it, and unfortunately it takes time).

I've heard that was the case with the old GFS model. They just updated the GFS model in 2021 to provide higher accuracy: https://www.noaa.gov/media-release/noaa-upgrades-flagship-us....

I'm not entirely sure how it compared to the ECMWF model during last year's hurricane season, but I do think it's improved substantially.

For comparison, the UK government Met Office installed a similar sized cluster of Cray XC40 machines about 6 years ago, with a 60 petaflop replacement arriving this year. Their forecasts are, anecdotally, locally considered a bit rubbish though.
You want a rubbish forecast? Just the other week, I got "no rain in your future" (24hr outlook). I live in Seattle. It's spring. Of course there was rain.
This is an interesting claim. Could you share a reputable source on your claim that the US weather prediction facility is behind its European counterparts? How does the US depend on Europe for weather predictions?
Although meteorology is in many ways a much older science, I think you are underselling the difference (and the importance of computers) here. Better computing power means a more accurate forecast, but typically also a longer forecast horizon. That is critical when preparing for natural disasters and absolutely saves lives all the time.

Even at a 3-day lead time, GFS was still suggesting landfall for Hurricane Sandy outside the New York region; the longer lead times provided by other centers (with more computing power) were very important for preparation [1].

Even on the science side, increased computing power enables a host of new discoveries. Even storing the locations for all the droplets in a small cloud would require an excessive amount of memory, let alone doing any processing [2]. Increased computer power enables us to better understand how clouds respond to their environment, which is a key uncertainty in predicting climate change.
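To put a rough number on the droplet-storage point, here is a back-of-envelope sketch. The inputs are illustrative assumptions rather than figures from the linked post: about 100 droplets per cubic centimetre, a "small cloud" taken to be a 1 km cube, and three single-precision coordinates per droplet.

```python
# Back-of-envelope: memory needed to store a position for every droplet.
# All inputs are illustrative assumptions, not measurements.
DROPLETS_PER_CM3 = 100      # assumed droplet concentration
CLOUD_VOLUME_KM3 = 1.0      # assumed "small cloud": a 1 km cube
BYTES_PER_COORD = 4         # single-precision float
COORDS_PER_DROPLET = 3      # x, y, z

CM3_PER_KM3 = 100_000 ** 3  # 1 km = 1e5 cm, so 1 km^3 = 1e15 cm^3


def droplet_position_bytes(volume_km3: float, per_cm3: float) -> float:
    """Bytes needed to hold one (x, y, z) position per droplet."""
    n_droplets = volume_km3 * CM3_PER_KM3 * per_cm3
    return n_droplets * COORDS_PER_DROPLET * BYTES_PER_COORD


total_bytes = droplet_position_bytes(CLOUD_VOLUME_KM3, DROPLETS_PER_CM3)
print(f"{total_bytes:.1e} bytes, about {total_bytes / 1e18:.1f} exabytes")
```

Under these assumptions the positions alone come to roughly an exabyte, before storing droplet sizes or velocities or doing any computation on them.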

Many disciplines of meteorology are also much newer than nuclear physics. Cloud physics (for example) only really got started with the advent of weather radar (so the 1940s). Before that, even simple questions (such as can a cloud without any ice in it produce rain?) were unknown.

Even today, we still have difficulty seeing into the most intense storms. You cannot fly an aircraft in there, and radar has difficulty distinguishing different types of particle (ice, liquid, mushy ice, ice with liquid on the surface, snow) and is not good at counting the number of particles either.

Even after thousands of years, we are only just now getting the tools to understand it. There is a lot left to discover about the weather!

[1] - https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/201...

[2] - https://www.cloudsandclimate.com/blog/clouds_and_climate/#id...

> Erik P. DeBenedictis of Sandia National Laboratories has theorized that a zettaFLOPS (10^21 or one sextillion FLOPS) computer is required to accomplish full weather modeling, which could cover a two-week time span accurately.[121][122][123] Such systems might be built around 2030.

https://en.wikipedia.org/wiki/Supercomputer

One of the more commonly discussed values is predicting where a major hurricane makes landfall. We can't reliably do that yet, but if we could, evacuation zones would be both smaller & more effective.
You're quite right.

"An estimate of future HPC needs should be both demand-based and reasonable. From an operational NWP perspective, a four-fold increase in model resolution in the next ten years (sufficient for convection-permitting global NWP and kilometer-scale regional NWP) requires on the order of 100 times the current operational computing capacity. Such an increase would imply NOAA needs a few exaflops of operational computing by 2031. Exascale computing systems are already being installed at Oak Ridge National Laboratory (1.5 exa floating point operations per second (EF)) and Argonne Labs (1.0 EF) and it is likely that these national HPC laboratories will approach 100 EF by 2031. Because HPC resources are essential to achieving the outcomes discussed in this report, it is reasonable for NOAA to aspire to a few percent of the computing capacity of these other national labs at a minimum. Substantial investments are also needed in weather research computing. To achieve a 3:1 ratio of research to operational HPC, NOAA will need an additional 5 to 10 EF of weather research and development computing by 2031. Since research computing generally does not require high-availability HPC, it should cost substantially less than operational HPC and should be able to leverage a hybrid of outsourced, cloud and excess compute resources."[1]

[1]https://sab.noaa.gov/wp-content/uploads/2021/11/PWR-Report_2...
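The "four-fold resolution, on the order of 100 times the compute" figure in the quote is consistent with a common rule of thumb. The rule is an assumption here, not something the quoted passage spells out: refining the horizontal grid spacing by a factor r multiplies the number of columns by r^2, and the time step usually has to shrink by about r as well for numerical stability, giving r^3. Refining the vertical levels too pushes the cost toward r^4. A minimal sketch:

```python
def compute_multiplier(refinement: float, refine_vertical: bool = False) -> float:
    """Rough cost multiplier for refining grid spacing by `refinement`.

    Assumed rule of thumb (not taken from the quoted report):
      - horizontal cell count grows as refinement**2
      - the time step shrinks in proportion, adding one more factor
      - optionally the vertical levels are refined as well
    """
    exponent = 4 if refine_vertical else 3
    return refinement ** exponent


low = compute_multiplier(4)                         # horizontal + time step
high = compute_multiplier(4, refine_vertical=True)  # plus vertical levels
print(f"4x finer grid: roughly {low:.0f}x to {high:.0f}x the compute")

# "A few percent" of the 100 EF the quote projects for the labs by 2031.
LAB_EF_2031 = 100.0
for fraction in (0.02, 0.03, 0.05):
    print(f"{fraction:.0%} of {LAB_EF_2031:.0f} EF = {fraction * LAB_EF_2031:.0f} EF")
```

The quoted "order of 100" sits between the two estimates, and a few percent of 100 EF lands on the "few exaflops" the report asks for.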

DOE computers are used by a wide variety of people/teams/projects, including academics and other institutions though.
This is a weird take. There are so many things behind the scenes to say anything conclusively. Different compute loads, different problem domains, different accuracy and predictability requirements, etc.

Cynicism is unwarranted, but it fits the current zeitgeist, biases and feels good.

> (..) when nuclear labs get >1000 petaflop super computers (..)

Would you prefer the research being performed based on empirical testing instead of running simulations?

IMO: we had good enough nuclear weapons 50 years ago to glass the whole planet, so why continue to try and improve a weapon of armageddon? Just maintain and build the same old nuclear weapons that are effective enough and try and remove the need for the weapons over time with the diplomatic and political process.
> IMO: we had good enough nuclear weapons 50 years ago to (...)

The money pouring into research disproves this.

In fact, it makes no sense at all to claim that our collective understanding of a phenomenon is already satisfactory and that all the research was done within a few years of the first real-world test.

For context, the Oklahoma City bombing happened a few decades ago, but it still motivates a great deal of research along multiple paths, even though none of it is rocket science or involves cutting-edge physics.

> The money pouring into research disproves this.

Yea, because money always goes to the most important and most useful research! /s

that's why the mission switched to "stockpile stewardship" some time ago - we have to maintain the reliability of the existing fleet since we don't build many new ones.
I am curious as to what class of problems is being solved on these supercomputers. Also, what's the abstraction of computation here? Is it a container?
Weather modeling - X kilometers by Y layers of atmosphere can get expensive to compute really quick. And NOAA does more than just simulate weather, they're running climate/sea level rise/arctic ice modelling, aggregating sensor data from buoys/balloons/satellites, processing maps, and more.
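A rough sketch of why "X kilometers by Y layers" gets expensive quickly. The grid spacings and layer counts below are hypothetical, chosen only to show the scaling, and the grid is idealized as square cells covering the Earth's surface (about 5.1e8 km^2):

```python
EARTH_SURFACE_KM2 = 5.1e8  # approximate surface area of the Earth


def grid_cells(spacing_km: float, layers: int) -> float:
    """Approximate cell count for a global grid with square cells."""
    columns = EARTH_SURFACE_KM2 / spacing_km ** 2
    return columns * layers


# Hypothetical configurations, not any real model's settings.
for spacing_km, layers in [(12, 64), (3, 64), (3, 128)]:
    cells = grid_cells(spacing_km, layers)
    print(f"{spacing_km:>2} km x {layers:>3} layers: {cells:.2e} cells")
```

Going from 12 km to 3 km spacing multiplies the cell count by 16 before any change in layers or time step, and every cell carries several variables that are updated at each step.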

I can't speak for NOAA, but my experience with supercomputing has been that there is no abstraction of computation, your workload is very much tied to hardware assumptions.

In my experience it's very hard to write code for parallel compute workloads and I am guessing that half of the code written would be creating abstractions about that.
They are used to do large-scale high-resolution analysis or simulation of complex systems in the physical world. The codes typically run on the bare metal with careful control of resource affinity, often C++ these days.

They aren't just used for global-scale geophysical processes like weather and climate or complex physics simulations. For example, oil companies rent time to analytically reconstruct the 3-dimensional structure of what's underneath the surface of the Earth from seismic recordings.

What do you mean bare metal? Because all of the big DOE computers are running Linux. Users will probably use hardware specific libraries (like CUDA/ROCm) and occasionally write some hardware specific asm, but none of the big computers are running without a POSIX OS.
I think they mean not in a VM, and instead using some job manager like Slurm or Condor. Typically users wouldn't have superuser privileges, precluding the use of things like Docker, which is why Singularity exists.
No, it's a gang-scheduled process; at least that's been the standard model. Those processes are run either close to bare metal or as a process on Linux. Containers would be useful to package up the shared dependencies, so that may have changed.
The nuclear lab computers are also rented out to anyone who applies for an XSEDE grant. Anyone with a successful grant gets free access (obviously limited to a reasonable number of core-hours). Anyway, a ton of university researchers, all the way from materials simulations to weather groups, will be using this computer to run their codes, as they have done for the last ones too.

In fact, such use accounts for the vast majority of the compute use.