by idlewords 3222 days ago
The nomenclature is not the problem; the problem is making numerical estimates of low-probability events based on just 150 years of data and a raft of modeling assumptions.

The problem is not making numerical estimates of low-probability events, or how such events are modeled; it's completely ignoring the probability distributions behind the models. All of the models use extremely long-tailed distributions, and the long tail is being almost entirely ignored.

We shouldn't be referring to this as a low-probability flood but as a high sigma flood.

ETA: Disclaimer: Day job includes rainfall statistics analysis.
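To make the "sigma vs. probability" point concrete, here is a minimal sketch comparing how likely a k-sigma exceedance is under a normal distribution versus a longer-tailed one. The exponential distribution is only a stand-in I picked because its tail has a closed form; it is not one of the flood models discussed here, and the function names are made up for illustration.

```python
import math

def normal_tail(k):
    """P(X > mean + k*sd) for a normal distribution."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

def exponential_tail(k):
    """P(X > mean + k*sd) for an exponential distribution with mean 1.

    An exponential with mean 1 also has sd 1, so the threshold is 1 + k.
    Assumption: used here purely as a simple long-tailed stand-in.
    """
    return math.exp(-(1.0 + k))

for k in (2, 3, 5):
    ratio = exponential_tail(k) / normal_tail(k)
    print(f"{k} sigma: normal={normal_tail(k):.3g} "
          f"exponential={exponential_tail(k):.3g} ratio={ratio:.3g}")
```

At 5 sigma the longer-tailed stand-in makes the event thousands of times more likely than the normal assumption does, which is the gap that gets lost when the tail is ignored.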

I don't know much about the field, but I'm curious. How much are estimates based on normal distributions, and how much does the model consider other distributions?

As you say, I don't think we know much about the tail. More than ten inches of rain may be a once-a-year phenomenon, but might that once-a-year event be twenty or fifty inches?

Disclosure: I am not a subject matter expert, I just do the programming they tell me to do. Mistakes/oversights here are likely my own, not my employer's or the software I work on.

The rain gage analysis the software I work on does is largely based on USGS data. [1] Almost all of that data is publicly accessible and you can explore the data down to individual monitoring stations if you wander through the site far enough.

The application I work with is primarily concerned with two bits of analysis from the data in a given station: average peak annual flow and mean daily flow. The analysis used for both (beyond linear interpolation and best fit line options) is a fit to a Gamma distribution [2], plotted on a logarithmic scale. (Rainfall is specifically mentioned under applications of that distribution on Wikipedia, so it seems to be the industry standard even outside of the specific application(s) I work on.)

[1] https://waterdata.usgs.gov/nwis/rt
[2] https://en.wikipedia.org/wiki/Gamma_distribution
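For anyone curious what "fitting to a Gamma distribution" can look like, here is a rough sketch using the method of moments on synthetic data. This is an assumption on my part for illustration: the comment above does not say which fitting method the software uses, the data here is generated rather than taken from USGS, and `fit_gamma_moments` is a hypothetical helper name.

```python
import random
import statistics

def fit_gamma_moments(samples):
    """Method-of-moments estimate of a gamma distribution's shape and scale.

    For a gamma distribution, mean = shape * scale and
    variance = shape * scale**2, so both parameters follow
    from the sample mean and sample variance.
    """
    mean = statistics.fmean(samples)
    var = statistics.variance(samples)
    shape = mean * mean / var
    scale = var / mean
    return shape, scale

# Synthetic "flow" data with known parameters (not real station data).
rng = random.Random(42)
true_shape, true_scale = 2.0, 3.0
samples = [rng.gammavariate(true_shape, true_scale) for _ in range(20000)]

shape, scale = fit_gamma_moments(samples)
print(f"estimated shape={shape:.3f}, scale={scale:.3f}")
```

With this much data the estimates land close to the true values. Real tools more often use maximum likelihood, which is generally more efficient but needs a numerical solver.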

I am not a mathematician, but I remember that if you can't assume a normal distribution and want to know the shape of the tails, you need a pretty ridiculous number of observations.
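A quick simulation sketch of that point: estimate the 99th percentile from repeated samples of a gamma distribution and see how much the estimate wobbles at 150 observations versus 5000. The gamma parameters, sample sizes, and helper names are all arbitrary choices for illustration, not anything taken from a real flood model.

```python
import random
import statistics

def empirical_quantile(samples, q):
    """Simple order-statistic estimate of the q-th quantile."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]

def quantile_spread(n, q, trials, rng):
    """Std. dev. of the quantile estimate across repeated samples of size n."""
    estimates = [
        empirical_quantile([rng.gammavariate(2.0, 3.0) for _ in range(n)], q)
        for _ in range(trials)
    ]
    return statistics.stdev(estimates)

rng = random.Random(0)
spread_small = quantile_spread(150, 0.99, 100, rng)   # roughly "150 years"
spread_large = quantile_spread(5000, 0.99, 100, rng)  # far more data
print(f"spread with n=150:  {spread_small:.3f}")
print(f"spread with n=5000: {spread_large:.3f}")
```

The spread shrinks only with roughly the square root of the sample size, so pinning down a rare quantile from about 150 annual observations leaves a lot of uncertainty.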
Pretty much the same errors that were made in the 2008 financial crisis.
I don't think it's accurate to say we only have 150 years of data. We only have about 150 years of data collected in real time by humans, but we have plenty of other data (soil/rock strata, core samples, tree rings, etc.).

In some ways the geological and archeological evidence is more trustworthy than the evidence we've written down in the past century or two.

Do floods show up on the "geological and archaeological" records? Sure we can say that 50M yrs ago there was an ocean here, but that's a different statement than "in the last 1000 yrs there have been 14 floods that reached this height above the river".
Big ones absolutely do. I'm not sure how small floods can get before the answer is no. IMNAG.
No, it's not so simple either. If 200 years ago there wasn't a city there, then any kind of soil/rock strata, core samples, tree rings etc. likely won't show flooding, and therefore won't be representative of flood risk today.

Houston is a concrete jungle, which significantly increases the risk of flooding compared to even a decade ago.