by tedunangst 3222 days ago
I don't know much about the field, but I'm curious. How much are estimates based on normal distributions, and how much does the model consider other distributions?

As you say, I don't think we know much about the tail. More than ten inches of rain may be a once-a-year phenomenon, but could that once-a-year event be twenty or fifty inches?

2 comments

Disclosure: I am not a subject matter expert, I just do the programming they tell me to do. Mistakes/oversights here are likely my own, not my employer's or the software I work on.

The rain gage analysis done by the software I work on is largely based on USGS data. [1] Almost all of that data is publicly accessible, and you can explore it down to individual monitoring stations if you wander through the site far enough.

The application I work with is primarily concerned with two bits of analysis from the data in a given station: average peak annual flow and mean daily flow. For both, the analysis (beyond linear interpolation and best-fit line options) fits the data to a Gamma distribution [2] and plots the result on a logarithmic scale. (Rainfall is specifically mentioned under applications of that distribution on Wikipedia, so it seems to be the industry standard even outside of the specific application(s) I work on.)

[1] https://waterdata.usgs.gov/nwis/rt
[2] https://en.wikipedia.org/wiki/Gamma_distribution
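
For anyone who wants to see what fitting a Gamma distribution looks like in practice, here is a minimal sketch. It is not the method the software above uses (I don't know which estimator it uses). It assumes a plain method-of-moments fit, and the data is synthetic, not USGS data. The function name `fit_gamma_moments` and the shape/scale values are made up for illustration.

```python
import random
import statistics

def fit_gamma_moments(samples):
    """Estimate Gamma shape and scale by the method of moments.

    For a Gamma distribution, mean = shape * scale and
    variance = shape * scale**2, so both parameters follow
    from the sample mean and sample variance.
    """
    mean = statistics.fmean(samples)
    var = statistics.variance(samples)  # sample variance (n - 1 denominator)
    shape = mean * mean / var
    scale = var / mean
    return shape, scale

# Synthetic "flow" values drawn from a known Gamma distribution (assumed values).
rng = random.Random(0)
true_shape, true_scale = 2.0, 3.0
flows = [rng.gammavariate(true_shape, true_scale) for _ in range(20000)]

shape, scale = fit_gamma_moments(flows)
print(f"estimated shape={shape:.3f}, scale={scale:.3f}")
```

With this many samples the estimates should land close to the values used to generate the data. A maximum-likelihood fit is usually preferred for real work, but the moments version keeps the idea visible.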

I am not a mathematician, but I remember that if you can't assume a normal distribution and want to know the shape of the tails, you need a pretty ridiculous number of observations.
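
A rough way to see this is to estimate a tail quantile from repeated small samples and watch how much the answer jumps around. The sketch below is only an illustration under assumptions: the Gamma parameters, the sample sizes, and the helper names `empirical_quantile` and `quantile_spread` are all made up, and the data is simulated.

```python
import random
import statistics

def empirical_quantile(samples, q):
    """Simple order-statistic estimate of the q-th quantile."""
    ordered = sorted(samples)
    return ordered[int(q * (len(ordered) - 1))]

def quantile_spread(n, q=0.99, repeats=100, seed=0):
    """Standard deviation of the estimated quantile over repeated samples of size n."""
    rng = random.Random(seed)
    estimates = [
        # Gamma(shape=2, scale=3) is an arbitrary choice for the simulation.
        empirical_quantile([rng.gammavariate(2.0, 3.0) for _ in range(n)], q)
        for _ in range(repeats)
    ]
    return statistics.stdev(estimates)

spread_small = quantile_spread(50)    # about 50 observations per estimate
spread_large = quantile_spread(2000)  # about 2000 observations per estimate
print(f"spread of 99th percentile estimate, n=50:   {spread_small:.2f}")
print(f"spread of 99th percentile estimate, n=2000: {spread_large:.2f}")
```

The estimate from 50 observations varies far more from run to run than the one from 2000, and this is for the 99th percentile only. Pinning down rarer events takes correspondingly more data.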