Hacker News
by kristjansson 2617 days ago
This isn't about academic or mathematical rigor - this is about responsible communication of statistics.

You're right that a general audience isn't going to look at the source, nor think about variance of output variables. Therefore, it's the responsibility of us as communicators of statistics to relate the conclusions that can be drawn from the data in a way that first and foremost is not wrong or misleading, and secondly captures the concept as accurately as possible for the audience.

The first principle is the overriding obligation. Your simplification can capture as little of the information and conclusions supported by the data as you want, but it cannot imply or state conclusions that are not supported.

You're getting this response to your statement, from me and others, because your interpretation of the source can easily be read as drastically overstating the character (and strength) of the relationship supported by the data - even if that's not what you intended.

1 comment

To you this is the major responsibility, but not to everyone.

I'm getting this response to my statement from you and many other data scientists who can't accept oversimplified math, yet all of whom have failed to produce a general description in plain English.

In some ways it's like the test: everyone hates it, but no one has a better alternative. You've complained maybe 7 or 8 times in this thread about how scientifically inaccurate my general summary is, but you have not produced a description that a regular person could understand in 5 words or less, with no technical jargon.

‘FYGPA is ...

somewhat associated with

partially explained by the combination of

... HSGPA and SAT’

You think that's a good description?

"Something is partially explained by a combination of numbers"?

That says nothing of semantic value. Better to exaggerate the causal relationship and give a sense of meaning than to offer meaningless generics like that, because at least the general reader intuits a sense of the universe of the relationship. The above implies nothing.