Hacker News
by uniqueuid 247 days ago
It is immensely frustrating to me that to this day, after a solid century of advances in sciences and mathematical literacy, we are still implicitly stuck in the mental model of averages.

Reality is fucking far away from averages and we know it. "The economy is doing great/terrible" is an almost worthless indicator unless the person you're talking about actually has business relations in every corner of it.

Yes, there are interdependencies, but they do not justify pretending that numbers are so expensive that we can only print two of them (mean, sd) at a time. Let's finally stop drinking information through a 2-mile straw and instead show high resolution 2d data at least.

[edit] this is of course not a criticism of parent or OP, it's a systemic problem that we all are guilty of.

1 comment

I agree with all of this. Many is the time I've had to tell developers I work with: "don't just look at the mean/median, look at a graph of the full distribution!... then slice your distribution a lot of different ways by all the tags/facets you have and look again at the slices." Often you find that a shift in the mean or median was driven by one particular class of data points that skewed the whole thing. (Looking at you, NVDA.) This is usually a little lecture I give in the context of performance engineering, where it's API response times or whatever, but it applies everywhere.
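For what it's worth, here is roughly what that lecture looks like in code. This is only a sketch on made-up data: the `records` list, the `endpoint` tag, the `ms` field, and the helper names `summarize` and `slice_by` are all hypothetical, and a real analysis would plot the distributions rather than print three numbers per slice.

```python
from collections import defaultdict
from statistics import mean, median, quantiles


def summarize(values):
    """Return several summaries of a sample instead of a single average."""
    ordered = sorted(values)
    # quantiles(n=20) yields 19 cut points; the last one is the 95th percentile.
    p95 = quantiles(ordered, n=20, method="inclusive")[-1]
    return {
        "n": len(ordered),
        "mean": mean(ordered),
        "median": median(ordered),
        "p95": p95,
    }


def slice_by(records, tag):
    """Group response times by one tag and summarize each group separately."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[tag]].append(rec["ms"])
    return {name: summarize(vals) for name, vals in groups.items()}


# Hypothetical data: many fast /search calls plus a handful of slow /export calls.
records = (
    [{"endpoint": "/search", "ms": 100 + i} for i in range(50)]
    + [{"endpoint": "/export", "ms": 2000 + 10 * i} for i in range(5)]
)

overall = summarize([r["ms"] for r in records])
per_endpoint = slice_by(records, "endpoint")

print("overall:", overall)
for name, stats in per_endpoint.items():
    print(name, stats)
```

On this toy data the overall mean lands well above anything a /search caller ever sees, because five /export calls drag it up. The per-endpoint slices make that obvious; the single average hides it.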

At the same time - and I think you agree with this and it's probably implicit in your comment - we have to beware of anecdata as well. "Two of my friends asked me for money" means very little, except that your friend group is having a rough time. The meso-scale, your "high resolution 2d data", is where to look if you want a textured picture of what's really going on while at the same time avoiding observer bias. Unfortunately, that kind of data is not always easy to get, or to interpret.