by therajiv 3254 days ago
Wow, the discussion on the Fukushima civil engineering decision was pretty interesting. However, I find it surprising that the engineers simply overlooked the linearity of the law and used a nonlinear model. I wonder if there were any economic / other incentives at play, and the model shown was just used to justify the decision?

Regardless, that post was a great read.

5 comments

In my opinion, the real problem in that case was not the overfitting, but that they extrapolated from that data. They didn't have anything above Magnitude 8. (https://ml.berkeley.edu/blog/assets/tutorials/4/earthquake-f...)

You should never, ever extrapolate. It doesn't matter what your model is, it won't work.

On a side note, it could be that there is a breakpoint at Magnitude 7.25, where the slope of the line really changes, and a segmented linear regression is appropriate (https://en.wikipedia.org/wiki/Segmented_regression). But we would need more data to be sure, anyway.
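For anyone who wants to try the segmented idea, here is a minimal sketch, not a claim about the real data. It assumes a continuous two-segment line and picks the breakpoint by grid search over least-squares fits. The data is synthetic with a kink planted at 7.25, and names like `fit_with_break` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic, made-up data with a known kink at 7.25 (NOT the real catalogue).
mag = np.linspace(5.0, 8.5, 71)
log_freq = 6.0 - 1.0 * mag - 1.5 * np.maximum(0.0, mag - 7.25)
log_freq = log_freq + rng.normal(0.0, 0.05, mag.size)

def fit_with_break(c):
    """Least-squares fit of a continuous two-segment line with a break at c."""
    # Columns: intercept, slope below the break, change in slope above it.
    X = np.column_stack([np.ones_like(mag), mag, np.maximum(0.0, mag - c)])
    beta, *_ = np.linalg.lstsq(X, log_freq, rcond=None)
    sse = float(np.sum((log_freq - X @ beta) ** 2))
    return sse, beta

# Try a grid of candidate breakpoints and keep the one with the smallest error.
candidates = np.linspace(5.5, 8.0, 51)
fits = [fit_with_break(c) for c in candidates]
best = int(np.argmin([sse for sse, _ in fits]))

best_break = float(candidates[best])
slope_below = float(fits[best][1][1])
slope_above = slope_below + float(fits[best][1][2])
print(best_break, slope_below, slope_above)
```

With only a handful of points above the break, as in the real plot, the estimated breakpoint would be far less stable than it is on this clean synthetic example.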

Not extrapolating isn't really an option in cases like this. You have to give some prediction for earthquakes of magnitude 9. Ultimately you must make a decision on whether to design for such an event.

But a sensible thing to do would be to draw many samples from the posterior distribution, instead of just using the maximum likelihood estimate. That way the prediction accurately represents the uncertainty resulting from not having any data above magnitude 8 as well as, perhaps, your background knowledge that earthquakes of magnitude 15 never happen.
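A rough sketch of what that could look like, under loud assumptions: a straight-line model, a flat prior, and a plug-in noise variance, so the posterior over the coefficients is approximated by a Gaussian centred on the least-squares fit. The data is synthetic and stops at magnitude 8. The background knowledge about impossible magnitudes is not encoded here; it would need an informative prior.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, made-up data: log10(frequency) vs. magnitude, nothing above 8.
mag = np.linspace(5.0, 8.0, 25)
log_freq = 6.0 - 1.0 * mag + rng.normal(0.0, 0.15, mag.size)

# Point estimate (least squares, i.e. maximum likelihood under Gaussian noise).
X = np.column_stack([np.ones_like(mag), mag])
beta_hat, *_ = np.linalg.lstsq(X, log_freq, rcond=None)

# Approximate posterior: Gaussian around beta_hat with covariance
# sigma^2 (X^T X)^-1, using the residual variance as a plug-in for sigma^2.
resid = log_freq - X @ beta_hat
sigma2 = resid @ resid / (mag.size - 2)
cov = sigma2 * np.linalg.inv(X.T @ X)
draws = rng.multivariate_normal(beta_hat, cov, size=5000)

def predict(m):
    """One predicted log-frequency at magnitude m per posterior draw."""
    return draws[:, 0] + draws[:, 1] * m

inside = predict(6.5)   # in the middle of the observed range
outside = predict(9.0)  # extrapolated beyond the data

print("spread at 6.5:", inside.std())
print("spread at 9.0:", outside.std())
print("pessimistic (95th percentile) log-frequency at 9.0:",
      np.percentile(outside, 95))
```

The spread of the draws at magnitude 9 comes out wider than inside the observed range, which is exactly the uncertainty a single fitted line hides.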

In retrospect they should have calculated both intercepts and taken the more pessimistic one. It's surprising they did not. However this could've been a decision based on the cost. Still weird that wasn't explicitly called out. Maybe it was.
Most likely, since building a facility to survive a 2.5x stronger shake would surely be a lot more expensive.

I was also curious about how the data in the past few years did not follow the same trend as before. Does anyone know if that is what geologists mean by being 'overdue' for an earthquake, the way California has supposedly been for a while?

Well, the data wasn't showing that the past few years were anomalous; rather, there were fewer high-magnitude earthquakes than expected. I don't think this has anything to do with being overdue for an earthquake. Most likely this is just because with events of low frequency (e.g. these higher-magnitude earthquakes were predicted to occur once every ~100 years by the linear model), large percent deviations from the expected value are more probable. Basically if you flip a coin 10 times you might imagine that 3 heads and 7 tails is pretty common, whereas 300 heads and 700 tails on 1000 tosses is comparatively extremely unlikely.
My point was that maybe for high-magnitude quakes the power law is invalid... Or at least I don't think we have enough data at this end to be certain of what is going on.
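The coin-flip comparison is easy to check exactly. A small sketch, assuming a fair coin and asking for the chance of seeing at most 30% heads (the helper name `prob_at_most` is just for illustration):

```python
from math import comb

def prob_at_most(k, n):
    """Exact probability of at most k heads in n fair coin flips."""
    # Sum the binomial counts as integers, then divide once to avoid underflow.
    return sum(comb(n, i) for i in range(k + 1)) / 2**n

p_small = prob_at_most(3, 10)       # 3 or fewer heads in 10 flips
p_large = prob_at_most(300, 1000)   # 300 or fewer heads in 1000 flips

print(p_small)
print(p_large)
```

The first probability is 176/1024, about 17%, while the second is vanishingly small, so the same 30% shortfall is ordinary with few events and essentially impossible with many.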

Here's another plot, this time of UK seismic frequency, where again the frequency for high-magnitude earthquakes seems 'under' the expected curve. Then again, these are only 2 plots...

http://www.quakes.bgs.ac.uk/hazard/Hazard_UK.htm

Actually it seems so. I'm no geologist, but a quick Google search for Gutenberg–Richter plots shows that this 'kink' can have a very specific physical reason:

http://www.frontiersin.org/files/Articles/233038/fbuil-02-00...

Honestly, the nonlinear model looks far better to my eye than the linear model. The error term of the linear model seems obviously dependent on X, which contradicts the notion that the linear model was "correct". I think the article does a disservice to the reader by oversimplifying and calling the linear model "correct".
As therajiv pointed out, there were only a handful of data points supporting the kink in the curve, down at its end, versus lots and lots for the main line. Even in freshman physics labs you get told it's a terrible idea to extrapolate from a few points at the end of a curve, because those are typically the noisiest.
Certainly. But the issues of a linear model being "correct" and how to extrapolate forward are not the same.
I would like to believe that the engineers working on building a nuclear reactor were not so easily fooled by an over-fitting problem; if that were the case, wouldn't that mean we now have actionable information on whether other nuclear power plants are designed correctly?
It is interesting for sure, but I have to point out that the containment systems actually did survive the quake. They didn't adequately plan for the tsunami. There may have been no distinction in the analysis, but I wouldn't infer that surviving the quake and surviving a potential tsunami were the same piece of analysis.