by _dps 1899 days ago
It is absolutely untrue that DL is immune to fat-tail problems, and it is important that no one operate mission-critical systems under this assumption.

The two fat tail questions one has to engage are:

- is it possible that a catastrophic input might be lurking in the wild that would not be present in a typical training set? Even with a 1M-instance training set, a one-in-a-million situation will appear (and affect your objective function) only once on average, and could very well not appear at all.

- can I bound how badly I will suffer if my system is allowed to operate in the wild on such an input?
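
To put a number on "could very well not appear at all", here is a minimal sketch that assumes the training instances are independent draws and that the rare situation has a fixed per-instance probability. The function name `prob_never_seen` is made up for illustration.

```python
import math

def prob_never_seen(p: float, n: int) -> float:
    """Probability that an event with per-sample probability p
    shows up zero times in n independent samples."""
    return (1.0 - p) ** n

p = 1e-6          # one-in-a-million situation
n = 1_000_000     # 1M-instance training set

missing = prob_never_seen(p, n)
print(f"P(never in training set) = {missing:.4f}")
# For small p this is close to the Poisson value exp(-n * p).
print(f"Poisson approximation    = {math.exp(-n * p):.4f}")
```

Under these assumptions the event is entirely absent from the training set a bit more than a third of the time, even though its expected count is one.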

DL gives no additional tools to engage these questions.

2 comments

> It is absolutely untrue that DL is immune to fat-tail problems

In fact, working on fat tail problems is currently a hot topic in ML.

I don't quite follow: isn't what you described, the occurrence of a gross outlier, a flaw fundamental to all forecasting? I should clarify that DL doesn't suffer from the problem that a normality assumption has with fat tails: a failure to capture the skew of the distribution.
It's not characteristic of all forecasting, only purely empirical forecasting.

Definitionally, the only way to reason about risk that doesn't appear in training data is non-empirical (e.g. a priori assumptions about distributions, or worst cases, or out-of-paradigm tools like refusing to provide predictions for highly non-central inputs).
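
As a rough illustration of the "refuse to predict for highly non-central inputs" idea, here is a sketch for a one-dimensional input. The class name `AbstainingPredictor`, the z-score test, and the threshold of 4 are all assumptions chosen for the example, not something the approach prescribes.

```python
import statistics

class AbstainingPredictor:
    """Wraps a model and refuses to answer far from the training data."""

    def __init__(self, predict, train_inputs, max_z=4.0):
        self.predict = predict
        self.mean = statistics.fmean(train_inputs)
        self.std = statistics.stdev(train_inputs)
        self.max_z = max_z  # a priori choice, not learned from data

    def __call__(self, x):
        z = abs(x - self.mean) / self.std
        if z > self.max_z:
            return None  # abstain: input is too far from anything seen
        return self.predict(x)

train = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
model = AbstainingPredictor(lambda x: 2.0 * x, train)

print(model(10.05))  # close to the training data, so a prediction is returned
print(model(50.0))   # far outside it, so the wrapper returns None
```

Note that the threshold is itself a non-empirical assumption: the data cannot tell you how far is "too far" for inputs it has never contained.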

DL is not any better (or worse) than any other purely empirical method at answering questions about fat-tail risk, and the only way to do better is to use non-empirical (a priori) tools. Obviously the tradeoff here is that your a priori assumptions can be wrong, and that too needs to be included in your risk model (see e.g. Robust Optimization / Robust Control).
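
A toy sketch of the difference, in the spirit of robust optimization: the empirical rule minimizes average loss over observed scenarios, while the robust rule minimizes worst-case loss over an a priori uncertainty set. The actions, scenarios, and loss numbers below are invented purely for illustration.

```python
# losses[action][scenario]; all values are made up for the example.
losses = {
    "aggressive": {"normal": 1.0, "bad": 3.0, "catastrophe": 100.0},
    "cautious":   {"normal": 2.0, "bad": 2.5, "catastrophe": 5.0},
}

# The catastrophe never shows up in the observed data.
observed = ["normal"] * 98 + ["bad"] * 2

# A priori assumption: a catastrophe is possible even though unobserved.
uncertainty_set = ["normal", "bad", "catastrophe"]

def empirical_choice(losses, observed):
    """Pick the action with the lowest average loss on observed data."""
    return min(
        losses,
        key=lambda a: sum(losses[a][s] for s in observed) / len(observed),
    )

def robust_choice(losses, scenarios):
    """Pick the action with the lowest worst-case loss over the set."""
    return min(losses, key=lambda a: max(losses[a][s] for s in scenarios))

print(empirical_choice(losses, observed))      # aggressive
print(robust_choice(losses, uncertainty_set))  # cautious
```

The robust answer is only as good as the uncertainty set: leave the catastrophe out of it, or price it wrongly, and the two rules agree again.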

I think it's wrong to assume that non-empirical methods can be reliably trusted to give better results. Humans are terrible at avoiding bias or evaluating risks, especially for uncommon events.
Food for thought: if every method for predicting event x is terrible, then you might as well not try to predict x and build your life in such a way that you never expose yourself to the risk of x happening.
From a Bayesian point of view, that amounts to a "prediction" that the probability of event x is so significant that you should build your life around it. But I guess if you knew enough for that sentence to make sense you wouldn't have posted your comment. So, suffice it to say that Bayesian decision theory cuts the knot you're talking about.
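
A minimal sketch of that point, assuming a two-action decision with a single prior probability for x. The function name `bayes_action` and the loss and cost figures are hypothetical. Avoiding x is the better action only when the prior probability times the loss exceeds the cost of avoidance, so choosing to avoid implies a lower bound on your probability for x.

```python
def bayes_action(p_x: float, loss_if_x: float, cost_of_avoiding: float):
    """Return the action with the lower expected loss, plus both values."""
    expected = {
        "expose": p_x * loss_if_x,   # take the risk
        "avoid": cost_of_avoiding,   # pay a sure cost to never face x
    }
    best = min(expected, key=expected.get)
    return best, expected

# Avoiding is optimal only if p_x > cost_of_avoiding / loss_if_x.
threshold = 50.0 / 1_000_000.0

print(bayes_action(1e-4, 1_000_000.0, 50.0)[0])  # avoid
print(bayes_action(1e-6, 1_000_000.0, 50.0)[0])  # expose
print(f"implied probability threshold: {threshold:.0e}")
```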