by ozgooen
2733 days ago
I'd definitely agree that correlations can be a really big deal, especially in very large models like that one. Guesstimate doesn't currently allow for correlations as you're probably thinking of them. However, if two nodes are both functions of a third base node, then they will be correlated with each other. You can use this to make somewhat hacky correlations in cases where there isn't a straightforward causal relationship.
Implementing non-causal correlations in an interface like this is definitely a significant challenge. It could introduce essentially another layer to the currently 2-dimensional grid. It's probably the feature I'd most like to add, but the cost has been too high so far.
I think Guesstimate is really ideal for smaller models, or for prototyping larger models. However, if you are making multi-million dollar decisions with hundreds of variables and correlations, I suggest more heavyweight tools (either enterprise Excel plugins or probabilistic programming).
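To make the shared-base-node trick concrete, here is a minimal Python sketch of the same idea outside Guesstimate. The node names (`base`, `revenue`, `costs`) and all the numbers are made up for illustration; the only point is that two nodes computed from a common third node come out correlated even though neither depends on the other.

```python
import random
import statistics

def pearson(xs, ys):
    """Plain Pearson correlation, written out to stay stdlib-only."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rng = random.Random(0)
n = 10_000

# Hypothetical shared base node, e.g. "overall market conditions".
base = [rng.gauss(0, 1) for _ in range(n)]

# Two downstream nodes: each is a function of the base node plus its own
# independent noise. Neither node refers to the other.
revenue = [100 + 10 * b + rng.gauss(0, 5) for b in base]
costs = [60 + 4 * b + rng.gauss(0, 3) for b in base]

r = pearson(revenue, costs)
print(f"correlation between revenue and costs: {r:.2f}")
```

The strength of the induced correlation is set by how much of each node's spread comes from the base node versus its own noise, so shrinking the independent noise terms pushes the correlation toward 1.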
> where there isn't a straightforward causal relationship
One way to interpret a global pairwise correlation is simply that the person building the model is systematically biased in one direction, either too pessimistic or too optimistic. This is a 'non-causal' relationship, but it is often the biggest contributor to variance between the model and the real world.
Philosophically, this is a bit like the difference between 538's modeling approach and Princeton Election Consortium's for the 2016 election—the former gave Hillary a 2/3 chance of winning, while the latter ascribed a ~99% chance.
The risk of leaving modeling error out is that you'll end up with much more confidence than is called for. It feels very different to come up with a point estimate (I'll save $10k this year) vs. a tight range (I'll save $9k-$11k this year) if the true range is much wider.
In the former case you know your point estimate may be very far off, but in the latter you may be tempted to rely on an estimate for variance that is too low.
> It could introduce essentially another layer to the currently 2-dimensional grid
You could probably get away with doing almost all of this automatically for the user as long as they decide on what the 'primary' output is:
- For every input, calculate whether it's positively or negatively correlated with the output
- Apply a global rank correlation to all the inputs with all the standard techniques, flipping the signs found above as appropriate
- Report what the output range looks like with a significant positive correlation (usually the negative correlation case isn't as interesting)
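The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a description of how Guesstimate works or would implement it: the savings model, the input distributions, and the correlation strength `rho` are all hypothetical, and for the "standard techniques" I've picked one option, reordering each input's samples to follow a shared latent factor (in the spirit of rank-reordering methods such as Iman-Conover). Other approaches would work too.

```python
import math
import random
import statistics

def model(income, rent, food):
    """Hypothetical 'primary' output: money saved this year."""
    return income - rent - food

def run(samples):
    """Evaluate the model row by row over a dict of input sample lists."""
    return [model(*row) for row in zip(*samples.values())]

def sign_of_effect(xs, ys):
    """+1 if the input moves with the output, -1 if it moves against it."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return 1.0 if cov >= 0 else -1.0

def reorder_to_match(values, scores):
    """Rearrange `values` so their ranks follow the ranks of `scores`.
    The marginal distribution of `values` is left untouched."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    out = [0.0] * len(values)
    for v, idx in zip(sorted(values), order):
        out[idx] = v
    return out

def interval_90(samples):
    s = sorted(samples)
    return s[int(0.05 * len(s))], s[int(0.95 * len(s))]

rng = random.Random(1)
n = 20_000
rho = 0.5  # assumed strength of the global correlation

inputs = {
    "income": [rng.gauss(50_000, 3_000) for _ in range(n)],
    "rent": [rng.gauss(24_000, 2_000) for _ in range(n)],
    "food": [rng.gauss(16_000, 2_000) for _ in range(n)],
}

# Step 1: with independent inputs, find the direction of each input's effect.
independent_out = run(inputs)
signs = {k: sign_of_effect(v, independent_out) for k, v in inputs.items()}

# Step 2: tie every input to one shared latent factor, flipping the sign so
# that "optimistic" draws of all inputs tend to occur together.
shared = [rng.gauss(0, 1) for _ in range(n)]
correlated = {}
for k, v in inputs.items():
    scores = [signs[k] * math.sqrt(rho) * z + math.sqrt(1 - rho) * rng.gauss(0, 1)
              for z in shared]
    correlated[k] = reorder_to_match(v, scores)

# Step 3: report the output range with and without the global correlation.
correlated_out = run(correlated)
lo_i, hi_i = interval_90(independent_out)
lo_c, hi_c = interval_90(correlated_out)
print(f"independent inputs: 90% range {lo_i:,.0f} to {hi_i:,.0f}")
print(f"globally correlated: 90% range {lo_c:,.0f} to {hi_c:,.0f}")
```

Each input keeps exactly the same marginal distribution in both runs; only the pairing of draws across inputs changes. The wider second range is therefore a rough picture of how much uncertainty a systematic optimistic or pessimistic bias would add.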