Why Economic Models are Always Wrong (scientificamerican.com)
74 points by Sato 5344 days ago
21 comments

I think what the author is describing is simple overfitting.

http://en.wikipedia.org/wiki/Overfitting

It is quite a newbie mistake for a scientist to be surprised by it. It affects every kind of modelling.

I thought maybe this article would talk about why economic models are worse than other kinds of models. Issues arise when applying scientific models to the economy because, when even good models are used to predict markets, using those models to trade distorts the markets. When multiple parties use good models to compete in markets, they distort the markets in a way that destroys the predictive power of the models.

There is a great explanation by Glen Whitman of Agoraphilia, that uses grocery line wait time predictions as a metaphor for this:

http://agoraphilia.blogspot.com/2005/03/doing-lines.html

See also:

http://lesswrong.com/lw/yv/markets_are_antiinductive/

http://en.wikipedia.org/wiki/Efficient-market_hypothesis

Alternatively, it may be simple information theory: a model that takes in 100 bits of specification simply cannot correctly describe a process that has 10,000 bits' worth of degrees of freedom. And that's before we talk about iteration over time, and before we get to the final killer you mention, which is when the models are ruined by their own application to the domain.

I think radical underspecification is much more likely than overspecification, really.

(Since I encounter this a lot, let me pre-answer one question in advance, which is "What if only 300 bits really matter and the rest don't matter as much?" and the answer is that the term bit in information theory encompasses that idea already. If you have ten "bits", but they tend to be highly correlated together such that they are usually all 0 or all 1, you in fact don't have ten bits in information theory. Ten bits are, by definition, ten fully-independent true or false values. Bits-in-memory are not the same as information-theory-bits. A real system with 10,000 bits can not, pretty much by definition, be modeled by 100 bits. If it could, it would be a system with only 100 bits in the first place. Information theory cares about the true degrees of freedom available, not about your particular representation of the system.)
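A minimal sketch of the correlated-bits point, assuming we measure "information-theory bits" as the Shannon entropy of the observed states. The two sample sets are made up for illustration: one where ten stored bits always move together, and one where they are fully independent.

```python
import math
from collections import Counter

def entropy_bits(samples):
    """Shannon entropy (in bits) of the empirical distribution of `samples`."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Ten stored bits that are always all 0 or all 1: only two states ever occur.
correlated = [(0,) * 10, (1,) * 10] * 512

# Ten independent bits: each of the 2**10 possible states occurs exactly once.
independent = [tuple((i >> k) & 1 for k in range(10)) for i in range(1024)]

h_correlated = entropy_bits(correlated)
h_independent = entropy_bits(independent)
print(h_correlated, h_independent)
```

Both sets occupy ten bits of memory per sample, but only the second carries ten bits of information; the first carries one.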

Here's the thing: you're both right. It's both radically underspecified and overfitted. The information-theoretic argument demonstrates that a model cannot exactly match the reality unless it's as complex as the reality.

This article speaks of the separate problem that economic models are not evaluated in any sort of experiments, and thus are prone to overfitting. This makes them unlikely to even approximate well.

Consider a basic multilayer perceptron-style neural network. Overfitting is a well-understood problem in training an MLP. We work around it by training on a part of the data, and then measuring its accuracy on another part -- much as Carter did in his analysis. If the accuracy is poor, something is adjusted: the size of the hidden layer can be increased, the training set expanded, the duration of the training increased or decreased, or the MLP model discarded entirely.

If increasing the training set or reducing the training duration improves accuracy against the test set, we had an overfitting problem.
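The hold-out check described above can be sketched without a neural network. This example swaps the MLP for two simpler stand-ins: a straight line, and a degree-4 interpolating polynomial that passes through every training point. The data are hypothetical and the noise is hand-picked so the run is deterministic.

```python
# Hypothetical data: the underlying process is y = x, observed with
# hand-picked noise of +/-0.5 so the run is fully deterministic.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.5, 0.5, 2.5, 2.5, 4.5]
test_x = [0.5, 1.5, 2.5, 3.5]   # held out, never used for fitting
test_y = [0.0, 2.0, 2.0, 4.0]

def fit_line(xs, ys):
    """Least-squares straight line (2 parameters)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_interpolant(xs, ys):
    """Lagrange polynomial through every training point (5 parameters)."""
    def predict(x):
        total = 0.0
        for j, (xj, yj) in enumerate(zip(xs, ys)):
            weight = 1.0
            for m, xm in enumerate(xs):
                if m != j:
                    weight *= (x - xm) / (xj - xm)
            total += yj * weight
        return total
    return predict

def mse(model, xs, ys):
    """Mean squared error of `model` on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

line = fit_line(train_x, train_y)
interp = fit_interpolant(train_x, train_y)

line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
interp_train, interp_test = mse(interp, train_x, train_y), mse(interp, test_x, test_y)
print("line:       ", line_train, line_test)
print("interpolant:", interp_train, interp_test)
```

The interpolant wins on the training points (zero error) and loses on the held-out points, which is the signature of overfitting that the train/test split is meant to expose.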

"It's both radically underspecified and overfitted."

He used a perfect model (of a hypothetical world) which had exactly the right parameters, and then he calibrated it using exactly correct data.

So I don't see how this could be underspecified or overfitted. Can you please explain?

"The information-theoretic argument demonstrates that a model cannot exactly match the reality unless it's as complex as the reality."

In this case he defined his model to be reality.

Those particular statements referred to some representative economic model, not the experiment in question. In the experiment in question, the model is fully specified by definition.

As far as overfitting goes, that applies when you have a parameterized general model and need to discover the correct parameters. You probably won't get the exact correct parameters; instead, you'll (hopefully) get parameters that approximate reality well.

More closely matching the training data can actually make it a worse approximation in the general case.

"The information-theoretic argument demonstrates that a model cannot exactly match the reality unless it's as complex as the reality."

What if reality is self-similar at certain scales? You could generate something that resembles the whole from one part of it.

Then the reality is, information-theoretically, less complex and you can use a less complex model to represent it.
Re. your eloquent bracket: the "true number of bits", as in "the most compact description of any given state of a system", is in general uncomputable (see Kolmogorov complexity), for any sufficiently powerful language of description.

If you require notions such as "the true minimum number of bits" to be practical, you have to put additional restrictions on the language by which you describe the system -- such as your probability model. The representation does matter.

You're right, and overfitting cannot be an explanation for this phenomenon -- when there are many equally valid alternative outcomes to a problem, which is what's being described, the solution is underdetermined by definition.

In the (ML) terms I'm used to, it is an error surface with many local minima. That is, if you start out with a guess for the parameters and try to progressively optimize the cost function to reach a point where the error is lowest (i.e. the gradient of the error is 0), where you end up is extremely dependent on where you start out. When you find a local minimum, you have found a point where there is no nearby point that is better, but there may be some other point (or many) somewhere else in the model that is better. The very best one is the global minimum.

This is a well known problem in ML for non-convex error functions, and there are various methods for trying to avoid local minima and reach a global minimum.
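A small sketch of the dependence on the starting point, assuming plain gradient descent on a made-up one-dimensional error function with two minima. The function, learning rate and step count are illustrative choices, not anything from the article.

```python
def loss(x):
    """A hypothetical non-convex error function with two minima."""
    return x ** 4 - 3 * x ** 2 + x

def grad(x):
    """Derivative of `loss`."""
    return 4 * x ** 3 - 6 * x + 1

def descend(x, lr=0.01, steps=2000):
    """Plain gradient descent from starting guess `x`."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

left = descend(-2.0)    # settles in the deeper (global) minimum
right = descend(2.0)    # settles in the shallower (local) minimum
print(left, loss(left))
print(right, loss(right))
```

Both runs stop where the gradient is zero, but only the one started on the left reaches the lower error.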

But this case is actually worse than that -- it is an error surface with many global minima. Each is effectively a perfect fit for the data to date, but gives different predictions about future data. Since each function is a perfect fit, it is literally impossible to predict the proper parameters. Which is what underspecification is.
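To make the "many global minima" case concrete, here is a toy sketch under an assumption of my own: a two-parameter linear model whose two inputs happened to move together throughout the historical record. The model form and all numbers are hypothetical.

```python
def predict(params, x1, x2):
    """Toy model: y = a * x1 + b * x2."""
    a, b = params
    return a * x1 + b * x2

def sse(params, data):
    """Sum of squared errors of `params` on (x1, x2, y) observations."""
    return sum((predict(params, x1, x2) - y) ** 2 for x1, x2, y in data)

true_params = (1.0, 3.0)

# In the historical record the two inputs were always equal,
# so the data only ever constrain the sum a + b.
history = [(x, x, predict(true_params, x, x)) for x in (1.0, 2.0, 3.0, 4.0)]

candidates = [(1.0, 3.0), (3.0, 1.0), (2.0, 2.0), (0.0, 4.0)]
errors = [sse(p, history) for p in candidates]

# A future situation where the inputs finally diverge.
forecasts = [predict(p, 1.0, 0.0) for p in candidates]

for p, e, f in zip(candidates, errors, forecasts):
    print(p, "historical error:", e, "forecast:", f)
```

Every candidate fits the history exactly, yet each gives a different forecast once the inputs stop moving together. No amount of optimization on the historical data can pick among them.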

A model that takes in 100 bits of specification simply cannot correctly describe a process that has 10,000 bits' worth of degrees of freedom.

If I'm correct, though, the OP is talking about creating a model with 100 bits of specification, and then creating a model of that model and trying to train those 100 bits, which seems like it should be a more tractable problem.

To me it sounds more like he's just rediscovered the fact that when you try to set a model's parameters based on a limited set of observations (he generated 3 years worth of data from his model, then trained parameters based on that data), there's a lot of uncertainty left over, and you won't necessarily get the right model.

This is quite obvious - if your observations only cover a limited portion of phase space, then you shouldn't be surprised that in a complex enough model multiple parameterizations will fit the observations equally well. You just didn't have enough freaking data to distinguish between the models! In all branches of science, we deal with this problem, and the solution is that you try to find the simplest possible model that accurately explains your data (or, as is happening in physics right now, you try to enumerate the next level of theories that reproduce current data so that you can figure out which experiments you'll need to run to distinguish between them).

So this doesn't hint at any sort of fundamental flaw with modeling in general (and yeegads, it has even less to do with finance...) - it's just that he didn't have enough data to infer a proper parameterization. Don't build complex models and expect to train them on small datasets...

"I think what the author is describing is simple overfitting."

It doesn't look like overfitting to me. The input data is perfect, and the model is perfect, so it doesn't look like overfitting can occur.

This is known to anyone who's ever monkeyed with any type of machine learning: genetic algorithms, Bayesian filters, anything.

I agree with many of the commenters in this article. This should be common knowledge.

I also, like many commenters, couldn't help but think of model-based climate predictions.

The problem is that you're comparing statistical methods with process-based methods. The mathematically inclined tend to have a reflex to approach modeling with this sort of black box method. The thing is that for modeling processes like geomorphology and hydrology, but also less quantitative processes like quality of life in urban environments, black box methods can be neither verified nor reasoned about, and issues like overfitting become a problem.

On the other hand, you can model by building conceptual models, calibrating them by hand (using computer methods for the number crunching only) and reasoning about divergences between model results and observed data rather than computing them away with raw power. This is what modeling should be about - a tool for understanding.

(this topic is dear to my heart - I have had this discussion so often. Models are not crystal balls, they are tools for understanding processes. Which is why I am so desperate when another economist, mathematician or computer scientist stands up and wants to model processes that require understanding with their barbaric brute force statistical methods to not have to study things that are outside of their comfort zone. When all you have is a hammer etc.)

"All models are wrong. Some models are useful." - George Box
This article is more about how multiple sets of parameters can fit the same data equally well. This is why economists draw a distinction between calibration and estimation. If a parameter is "identified" in some estimation procedure, they mean they have an experiment or quasi-experiment that gives them a credible CI for the true parameter.
I think that with economic models used for trading there is also another big problem: Their application changes the model itself. So, even if you had a perfect model for the market without you applying your model, as soon as you start applying it, the market changes... and this is also true for all the other quants who do the same with their models.

IMHO, it was much better when most stock market decisions were mostly based on "fundamentals". Because that way the market was incentivising sound business decisions.

> So, even if you had a perfect model for the market without you applying your model

Actually, most traders' models do take market impact into account. If you had a perfect model for the market, I'm pretty sure that you (as a participant) would be included. In fact, your own actions are the easiest part of the model to get right, because you control them entirely.

Ok, but you would need to take into account the interaction with other traders' models that are put into play all the time...
yup, building good models is hard. doesn't mean they always produce bad results. billions of dollars of quant hedge fund money prove that.
"essentially, all models are wrong, but some are useful" George EP Box - one of the greatest truths imho :)
'it was much better when most stock market decisions were mostly based on "fundamentals"'

I don't recall such a period. Is there a particular interval you're thinking of?

I've thought there was more opportunity in fundamentals up until Warren Buffett and Ben Graham's The Intelligent Investor became well known. More people tried to use these methods, thereby increasing demand and decreasing the upside on securities that meet Graham and Buffett's criteria. The stock market today is very different from when they got going, although long term I don't know that anything has fundamentally changed; even before robot traders there had always been random and unexplainable noise.
Wait, if there was more "opportunity in fundamentals" back then it would mean that stocks were further away from their fundamentals, right? That's pretty much the opposite of what the OP is complaining about.
This was probably the most insightful thing I've seen all week. Thanks for making me smile.
Ha, yes, you're right.
I'm actually not thinking about a specific period, I was just referring (maybe naively) to the time before computer assisted analysis became so widespread (before the eighties, I guess).
People, not computers, make financial crises.

http://en.wikipedia.org/wiki/List_of_economic_crises

There is a good reason for the stock market to have become so complex, it's become so that few people can really understand how it works, how to gain from it and who plays with it.
Great discussion! The author doesn't seem to introduce the concept of training/testing datasets, which are absolutely critical to obtaining any reasonable model. So I don't buy the author's thesis that economic models are always wrong.

The solution to the hypothetical problem posed in the article is to separate the historical dataset into training and testing groups. The models should be generated while only 'seeing' the training data. You will, as the author mentioned, get many models that appear to fit the data. Most of these models will be garbage.

The fun part is when the testing data is introduced against the many models generated above. Most of the models will completely bomb, but a handful may actually predict the previously 'unseen' testing data with high accuracy. Those few models which pass the testing stage are the ones worth their salt.

Due to the self-aware nature of the markets, successful models probably will not be true indefinitely, but it's very possible they may be true long enough to be profitable. The less known your successful models are, the longer they will be successful predictors of the market. Hence why successful quant funds are notoriously secretive with their approaches. Open source would never work in finance.

"training/testing datasets, which are absolutely critical to obtaining any reasonable model"

This is partly correct but, in general, too strong.

Am I commenting on the OP? Not really!

Why too strong? Because it assumes too little and sometimes more information is available and with the extra information a 'testing data set' may not be needed.

Why are 'testing data sets' important? If about all you have to go on is the 'historical data' and then are just searching for a 'model' based mostly just on what 'fits' the data, then, sure, a 'testing data set' will likely be just crucial. One way to get such a 'testing data set' is to partition the 'historical data' into two parts, use the first to 'fit' a model and the second to 'test' the fit. Of course, there are still risks: If fit 10,000 models, find 10 that fit well and test each of the 10 with the 'testing data set' and accept the model that fits the testing data the best, then still may have some problems from a 'generalized version of overfitting'! As I recall, there has been some mathematical statistics to address this issue.
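The "generalized version of overfitting" can be sketched with a simulation. Everything here is an assumption for illustration: the "models" are seeded coin flippers with no skill at all, the labels are random, and the counts are arbitrary. The only point is what happens when the test set is used to pick a winner.

```python
import random

N_MODELS = 1000   # hypothetical number of candidate models
N_TEST = 20       # size of the test set used to pick the winner
N_FRESH = 2000    # data never touched during selection

label_rng = random.Random(12345)
test_labels = [label_rng.randint(0, 1) for _ in range(N_TEST)]
fresh_labels = [label_rng.randint(0, 1) for _ in range(N_FRESH)]

def model_predictions(model_id, n):
    """Each 'model' is a seeded coin flipper: it has no predictive skill."""
    rng = random.Random(model_id)
    return [rng.randint(0, 1) for _ in range(n)]

def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Score every model on the test set and keep the best one.
test_scores = [accuracy(model_predictions(m, N_TEST), test_labels)
               for m in range(N_MODELS)]
best_model = max(range(N_MODELS), key=lambda m: test_scores[m])
best_test_accuracy = test_scores[best_model]

# Evaluate the winner on fresh data (its predictions after the first N_TEST).
fresh_preds = model_predictions(best_model, N_TEST + N_FRESH)[N_TEST:]
fresh_accuracy = accuracy(fresh_preds, fresh_labels)

print("winner's accuracy on the test set:", best_test_accuracy)
print("winner's accuracy on fresh data:  ", fresh_accuracy)
```

The winner looks skilled on the test set only because it was selected on it. Once a test set is used for selection it behaves like training data, so a further untouched set is needed to judge the chosen model.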

Where can get by without a 'testing data set'? Broadly if know more than the meager assumptions common in 'machine learning' or 'curve fitting'.

What more can be known? In principle the variety is large.

Examples? Sure: Broadly just simple, old 'regression analysis', looked at as statistical estimation, makes a long list of quite detailed assumptions. E.g., we assume that there is a model that works and that we know in good detail the form of that model. We assume a lot about the 'historical data' we have, e.g., we assume 'homoscedasticity' and mean zero, independent and identically distributed (i.i.d.) Gaussian for the errors. We make some assumptions about dimensionality (e.g., to get around 'overfitting'). Then the usual derivations give minimum variance, unbiased estimates of the unknown parameters and more, all without any use of 'testing data'. "Look Ma, no testing data required!".

"Yes, son, but as your father kept telling you, a LOT of assumptions are required, and the assumptions are not all easy to verify. Or the regression derivations are a nice logical trip from island A to island B we would like to get to but we don't always know how to get to island A.".
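A minimal sketch of the regression route, assuming the simplest case of one predictor and made-up data. Under the assumptions listed above (known linear form, i.i.d. mean-zero Gaussian errors with constant variance), the textbook formulas give both the estimates and their uncertainty from the fitting data alone.

```python
import math

# Hypothetical observations, assumed to follow y = intercept + slope * x + error.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Ordinary least squares estimates.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Unbiased estimate of the error variance (n - 2 degrees of freedom),
# and the resulting standard error of the slope.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
sigma2 = sum(r * r for r in residuals) / (n - 2)
se_slope = math.sqrt(sigma2 / sxx)

print("slope:", slope, "+/-", se_slope)
print("intercept:", intercept)
```

No data were held out here; the standard error is only as trustworthy as the assumptions that produced it.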

Other examples? Sure: Calculate the trajectory of a space craft doing 'slingshots' in the inner solar system and then reaching, say, Saturn. We start with Newton's second law, his law of gravity, maybe a little about the solar wind, a lot of details about the orbits of the planets, and do some good numerical work with an initial value problem of an ordinary differential equation. We build a 'model' but don't really 'fit for parameters' or use 'historical data' and have no real use for 'testing data'. Why? Because we believe in Newton's laws and our numerical work. A 'model'? Yes. Fitting 'parameters'? No.

Can there be a connection between space craft trajectories and economic models? Sure: Bring more assumptions than just curve fitting. An example is to bring, essentially, accounting. So, then can get a Leontief input/output model. We bring basically just accounting data and not other historical data, do no real 'parameter' estimation, and use no 'testing' data. If the input data is noisy, then, sure, so will be the output and we might do some work with confidence intervals. Still we don't check with 'testing data'.
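For a sense of what a Leontief input/output calculation looks like, here is a two-sector sketch. The coefficient matrix and final demand are invented numbers standing in for accounting data; the model is the standard balance x = A x + d, solved as (I - A) x = d.

```python
# A[i][j]: units of sector i's output consumed to make one unit of sector j's
# output. These coefficients and the demand vector are hypothetical.
A = [[0.2, 0.3],
     [0.4, 0.1]]
demand = [100.0, 50.0]   # final (outside) demand for each sector

def solve_2x2(m, rhs):
    """Solve a 2x2 linear system m @ x = rhs by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("singular system")
    x0 = (m[1][1] * rhs[0] - m[0][1] * rhs[1]) / det
    x1 = (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det
    return [x0, x1]

# Build (I - A) and solve for total output x.
leontief = [[1 - A[0][0], -A[0][1]],
            [-A[1][0], 1 - A[1][1]]]
output = solve_2x2(leontief, demand)

# Accounting check: total output = intermediate use + final demand.
balance = [A[i][0] * output[0] + A[i][1] * output[1] + demand[i]
           for i in range(2)]
print(output, balance)
```

Nothing is fitted here: the answer follows from the accounting identity and the coefficients, so there is no train/test split to perform.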

More examples? Sure: The broad field, with many techniques, of distribution-free statistical hypothesis testing is based on historical data and some assumptions and really needs no testing data. What is obtained is much like a 'model' where can plug in new data and get the intended results. The assumptions are typically that the data is i.i.d.
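One member of the distribution-free family is the exact permutation test, sketched below on made-up measurements. The only assumption used is the one named above: under the null hypothesis all six values are i.i.d., so every way of splitting them into two groups is equally likely.

```python
from itertools import combinations

# Hypothetical measurements from two groups.
group_a = [12, 15, 14]
group_b = [9, 8, 11]

pooled = group_a + group_b
n_a = len(group_a)
observed = sum(group_a)

# With the pooled total fixed, a larger sum for group A is equivalent to a
# larger difference in means, so integer sums are compared directly.
splits = list(combinations(range(len(pooled)), n_a))
as_extreme = sum(1 for idx in splits
                 if sum(pooled[i] for i in idx) >= observed)

# One-sided p-value: share of relabelings at least as extreme as observed.
p_value = as_extreme / len(splits)
print(as_extreme, "of", len(splits), "splits; p =", p_value)
```

The procedure needs no distributional form and no separate testing set, and new data can be plugged in the same way.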

Net, a lot can be done beyond the common approach of machine learning curve fitting.

As usual, an excellent summary. Economic models based on low level data (essentially better "instrumentation", capturing bank transactions, some contracts, individual spending etc.) might be quite useful at least for short term prediction. Perhaps old ideas of "optimal control" can be to some extent realised.
Every model is "wrong", by definition of it being a "model" and not "reality". It's one of the few mind opening things I've learnt at university.

That's not a problem if you take it as an incentive to improve how much you know about the real world. It's a problem when you put the model before the people, and say that "models got us in trouble because of calibration problems".

An economic crisis is not an unavoidable natural disaster, it's people screwing up other people.

Simple and to the point. Couldn't agree more.
This article avoids terminology, data and any specifics on the problem to the point that it is useless.

You might be fooled into thinking it says something useful if you don't know what a 'model' means in any science.

So what is the point of the article? The author is trying to sell you his book where he most probably makes people who don't know anything about economics feel good or push an ideological agenda.

It's not so much useless, it's just far more general than the author probably intends.

All his arguments apply equally well to any scientific models which require fitting, in geophysics (as he acknowledges), atmosphere/ocean science, climate modelling, most of biology, ecology, etc.

Why he singled out economics is beyond me.

The author is only partially right. The mistake is in defining a closed system that is in fact not closed, and then curve fitting.

For instance, a great part of growth in the last 100 years has come from man's ability to harness energy from fossil fuels. If your time line is narrow enough, you can disregard the point that fossil fuels are not unlimited, and project a continued rise in extraction.

Another example is the baby boom, and the introduction of women into the paid work force which led to continued rise in property prices.

One more is the introduction of laws which suddenly compel people to invest in the stockmarket. It leads to short term asset inflation but generally makes worse investment all round.

That said, it is fitting that an economy is well modelled using the principles of hydraulics. See http://en.wikipedia.org/wiki/MONIAC_Computer

The problem with economic models, or most modeling, is not the methods. It's usually the quality of the features or parameters. Even in a much simpler problem, no matter how good the methods, if you don't have the right params, your model will suck. And economic models deal with an open-world system with ever-changing params; the challenge is not the methods, but how to discover quality parameters/features. And that requires not just the skills of modelers but many other disciplines.
Financial-risk models got us in trouble before the 2008 crash

Is this accurate? I remember reading that all the alarms were going off, they were just ignored or the models were "adjusted".

Carter's papers on the subject:

Ballester, P. J., & Carter, J. N. (2006). Characterising the parameter space of a highly nonlinear inverse problem. Inverse Problems in Science and Engineering, 14(2), 171-191. doi:10.1080/17415970500258162.

Ballester, P., & Carter, J. (2007). A parallel real-coded genetic algorithm for history matching and its application to a real petroleum reservoir. Journal of Petroleum Science and Engineering, 59(3-4), 157-168. doi:10.1016/j.petrol.2007.03.012.

A great quote from George E P Box:

All models are wrong. Some are useful.

So are those types of models useful? The ones that need to be adjusted constantly?
Actually, even when correctly parametrized, any predictive model will suffer from the paradox of the oracle: if you have an "oracle" capable of anticipating the decision of an actor, and this actor knows about the prediction, the actor can make the prediction false.

In an economy, some actors have an interest in falsifying the prediction, even if it is costly for them: it is often valuable to be unpredictable.

Is this really surprising? I would have thought this would be self-evident as these kinds of models would seem to be highly chaotic.

It's really no different than the meteorology simulations in the 60's that first discovered the butterfly effect.

http://en.wikipedia.org/wiki/Butterfly_effect#Origin_of_the_...

I think the butterfly has gone extinct when it was replaced with CO2.
A "scientist" finds by cross-validation that his model is overfitting the data. Luckily it wasn't published by a reputable source of science journalism.

http://en.wikipedia.org/wiki/Cross-validation_(statistics)

Also who the heck is Wilmott? He just pops up in the last paragraph with no introduction.

He's pretty well known in the quant community: http://wilmott.com/about.cfm , but I also had to double-check the article to see where he was introduced. I guess there was some heavy-handed editing.
Macroeconomics resembles a science in exactly two ways: it looks at history, and it makes predictions (or prescribes courses of action; these are equivalent).

Greek mythology resembled a science in those same two ways.

Economics ... makes predictions (or prescribes courses of action; these are equivalent).

This is absolutely false, except at the micro level. In terms of policy, economics can only inform us of the relative costs of various alternatives. It cannot tell us which alternative is right.

Consider the question of free trade. Virtually every economist agrees that free trade improves total efficiency (viz the Law of Comparative Advantage). However, at the margins, it may harm some individuals. Economics cannot tell us if it is morally right to incur those individualized harms in order to improve the lot of the whole, nor what if anything we should do to make whole those who were affected.

Economics makes predictions, giving us insights. Our morals are then needed to prescribe courses of action.

You are right. To prescribe a course of action is to make a prediction (that this action will lead to results in some way superior to alternative courses). The reverse, as you correctly point out, is not necessarily the case.
Stop tarring microeconomics with the brush you used on macro.
You're right, updated.
Isn't this just an instance of a chaotic system, in which the parameter settings that almost match the historical data will inevitably diverge because of sensitive dependence on initial conditions?
From scientist to scientist a little secret: All models are always wrong! If the model were correct it would be as detailed as reality and thus also as useless.
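Sensitive dependence is easy to see in a toy system. The sketch below uses the logistic map with r = 4 as a stand-in for a chaotic model (an illustrative choice, not the model from the article) and runs it from two starting values that differ by one part in a billion.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x) and return the orbit."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two runs of the same map whose starting points almost coincide.
orbit_a = logistic_orbit(0.2, 100)
orbit_b = logistic_orbit(0.2 + 1e-9, 100)

gaps = [abs(a - b) for a, b in zip(orbit_a, orbit_b)]
early_gap = max(gaps[:6])    # largest gap over the first few steps
late_gap = max(gaps[50:])    # largest gap over the second half of the run

print("early:", early_gap)
print("late: ", late_gap)
```

Over a short window the two runs are indistinguishable, which is why a near-match to history says little about agreement further out.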
This is perhaps naive, but why are the parameters to a model not considered as part of the model as a whole?
In the end it's a matter of convention. If you think about Newton's "model" of gravity, for example, you'll notice that the formula that describes the gravitational force can be plausibly explained based on intuition. However, the gravitational constant (i.e. the parameter) has no explanation. It just is.

Of course, a great deal of physics is ultimately about trying to make the parameters go away by explaining them using more fundamental models. But at any given level of abstraction, you'll have parts of the model that are reasoned intuitively, and parts of the model that just are the way they are, for no good particular reason other than that's what you happen to get by measuring.

Can you really compare Economic models to Physics models without discussing the simplifications necessary to create an Economic model?
You usually have to make a large number of simplifications to create a Physics model too. The question is how those simplifications change the accuracy of the model.

"Essentially, all models are wrong, but some are useful" -George E.P. Box

We can generalize to more than just physics and economics... As the quote says, ALL models are wrong (no matter what field). If you have a model that is "right" then it isn't a model, is it?
Assume a spherical cow...