Hacker News
by meric 5023 days ago
Correlation is statistical evidence.

You broke your own example by inserting the word "cause" into the sentence. Of course your example is a cause-effect relationship.

Statistical evidence suggesting correlation of two factors A and B is enough for one to say "The presence of A predicts B" as well as "The presence of B predicts A".
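To illustrate the symmetry claimed here, the following is a minimal sketch of the Pearson correlation coefficient in plain Python. The data values and the names `a`, `b`, and `pearson` are made up for illustration and do not come from the thread. The coefficient is the same whichever variable is listed first, which is the sense in which a correlation lets either factor stand in as the "predictor" of the other.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance times n).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical measurements of two factors A and B (illustrative only).
a = [0, 2, 5, 9, 12, 20]
b = [0, 1, 2, 5, 6, 11]

r_ab = pearson(a, b)  # correlation with A listed first
r_ba = pearson(b, a)  # correlation with B listed first
print(r_ab, r_ba)     # the two values are equal
```

Nothing in the computation refers to which factor came first in time or which one caused the other; it only measures how closely the two vary together.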

Please, don't warp the meaning of the word "predict".

"I predict tomorrow is going to rain" There is no way that sentence suggests a cause-effect relationship.

"This specific color pattern in the image is a great predictor of the presence of rain when the photograph was taken" Neither does this sentence suggest a cause-effect relationship.

No one here is confused between correlation and causation. I'm simply insisting the word "predict" has to do with correlation, not causation.


> Correlation is statistical evidence.

Yes -- evidence of a correlation, not a cause-effect relationship.

> Of course your example is a cause-effect relationship.

Puddles and rain? Yes, but it is only a description, not an explanation. Science requires explanations. Otherwise we open the door to pseudoscience, to people claiming any associations they care to claim.

> Please, don't warp the meaning of the word "predict".

It is you who is doing that -- look at the definitions at the bottom of this post.

Here's another example. I have a cure for the common cold -- I shake a dried gourd over the patient's head until he gets better. Sometimes it takes a week, but my treatment always works. The correlation is perfect, therefore I deserve a Nobel Prize for ridding the world of this scourge.

My dried gourd treatment "predicts" that the cold sufferer will get better -- always.

> I'm simply insisting the word "predict" has to do with correlation, not causation.

And you are mistaken. Rain is a predictor for bumper crops, but bumper crops are not a predictor for rain. Teenage driving is a predictor for car crashes, but car crashes are not a predictor for teenage driving.

A recent bogus study found a correlation between marijuana use and lower IQ. But the marijuana use did not predict the IQ drop, it was only correlated with it, and the researchers included this fact in their article. Needless to say, the science journalists ignored the qualifiers in the article and announced that marijuana use predicted a fall in IQ:

http://abcnews.go.com/blogs/health/2012/08/27/teenage-mariju...

Here's another account of the same study that makes a claim in its title that the article body contradicts. Title: "Smoking Pot In Teen Years Lowers IQ Later". A quote from the article: "But those who consistently smoke marijuana may simply make less intellectually stimulating choices at critical points in life."

Here is another bogus study: "Low I.Q. Predicts Heart Disease":

http://well.blogs.nytimes.com/2010/02/10/low-i-q-predicts-he...

Except that it's only a correlation, and use of the term "predicts" is nonsense. Needless to say, the article doesn't consider that the heart disease might predict the low IQ, not the reverse.

> I'm simply insisting the word "predict" has to do with correlation, not causation.

Yes, and you are mistaken.

http://dictionary.reference.com/browse/predict

"to declare or tell in advance; prophesy; foretell: to predict the weather; to predict the fall of a civilization."

http://www.merriam-webster.com/dictionary/predict

"to declare or indicate in advance; especially : foretell on the basis of observation, experience, or scientific reason"

Q.E.D.

You can't try to refute only some of my points and not others.

I meant to say your example of course implied a cause-effect relationship, since you added the word 'cause' to it. (of course it wasn't a cause-effect relationship, do you think I am stupid?)

>> My dried gourd treatment "predicts" that the cold sufferer will get better -- always.

If your treatment is indeed correlated with a cold sufferer getting better, significantly enough, then yes, your treatment does in fact predict it, even though it might not have been the cure. This is consistent with the definition of prediction being only related to correlation.
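Whether the treatment is "indeed correlated" with recovery is something one can check by comparing recovery rates with and without it. The sketch below uses entirely hypothetical records, under the assumption that every cold sufferer eventually recovers whether or not the gourd is shaken; the names `records` and `recovery_rate` are illustrative.

```python
def recovery_rate(records, treated):
    """Fraction of patients who recovered within the given treatment group."""
    group = [recovered for was_treated, recovered in records
             if was_treated == treated]
    return sum(group) / len(group)

# Hypothetical records as (was_treated, recovered) pairs.
# Assumption: everyone recovers from a cold eventually, gourd or no gourd.
records = [(True, True)] * 50 + [(False, True)] * 50

rate_treated = recovery_rate(records, treated=True)
rate_untreated = recovery_rate(records, treated=False)

# If the two rates are equal, knowing about the treatment tells you
# nothing extra about the outcome.
difference = rate_treated - rate_untreated
print(rate_treated, rate_untreated, difference)
```

Under that assumption the difference is zero, so the "if" in the sentence above matters: a treatment only predicts recovery, in the statistical sense, when the outcome actually varies with it.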

>> Rain is a predictor for bumper crops, but bumper crops are not a predictor for rain. Teenage driving is a predictor for car crashes, but car crashes are not a predictor for teenage driving.

That is correct; since bumper crops were only aided by rain if factors X1, Y1 are true, bumper crops are not a predictor of past rain unless X1, Y1 are true. Car crashes are correlated with teenage driving only if factors X2, Y2 are true, so car crashes are not a predictor of teenage driving unless X2, Y2 are true.

X1, Y1 = (Crops are water dependent, crops were not also thoroughly irrigated by other means)

X2, Y2 = (Teenager was intoxicated, teenager riding in car with more than 3 other members, all male)

>> But the marijuana use did not predict the IQ drop, it was only correlated with it, and the researchers included this fact in their article.

Yes it did, correlation means you can use the statistics to predict it, even if it wasn't the cause.

Predict means correlation, and correlation means what you think correlation means. You have the meaning of predict wrong. It has as much to do with causation as correlation has, i.e., not much (though some nonetheless, since if there is causation there is likely to be correlation).

"In statistics, prediction is a part of statistical inference. One particular approach to such inference is known as predictive inference, but the prediction can be undertaken within any of the several approaches to statistical inference. Indeed, one description of statistics is that it provides a means of transferring knowledge about a sample of a population to the whole population, and to other related populations, which is not the same as prediction over time."

http://en.wikipedia.org/wiki/Prediction#Statistics

If you are talking about prediction and correlation in the same sentence, you are in the realm of statistics and therefore should abide by its use of language.

You also did not address my two examples of use of predict in laymen sentences showing the word "predict" did not have anything to do with causation.

>> "to declare or tell in advance; prophesy; foretell: to predict the weather; to predict the fall of a civilization."

>> "to declare or indicate in advance; especially : foretell on the basis of observation, experience, or scientific reason"

Neither of those definitions has any semblance of implying causation. Observation, experience, or scientific reason can all be instances of correlation.

> You can't try to refute only some of my points and not others.

Of course I can. I chose only those points where your logical errors were most obvious.

> If your treatment is indeed correlated with a cold sufferer getting better, significantly enough, then yes, your treatment does in fact predict it, even though it might not have been the cure. This is consistent with the definition of prediction being only related to correlation.

No, that is false. Shall I list the definitions for "predict" again, or will you read them again on your own?

> You also did not address my two examples of use of predict in laymen sentences showing the word "predict" did not have anything to do with causation.

You quoted some laymen who suited your purpose, while I listed the dictionary definitions for the word, the definitions that show that "predict" means to assert an effect based on a cause.

> Neither of those definitions has any semblance of implying causation.

Oh, really?

"to declare or indicate in advance; especially : foretell on the basis of observation, experience, or scientific reason"

A prediction is therefore the use of observations of A to predict B, to show a cause-effect relationship. I see the disappearance of the middle class (A), and on that basis I predict the fall of civilization (B). I see gathering clouds (A), and on that basis I predict rain (B) -- and puddles (C).

> Observation, experience, or scientific reason can all be instances of correlation.

Yes, but it's a false analogy with no bearing on this topic. A prediction forges a link between an observation (A) and an outcome (B). It assumes a cause-effect relationship, one that may not be real, but a word isn't responsible for how people misuse it.

> Indeed, one description of statistics is that it provides a means of transferring knowledge about a sample of a population to the whole population

Yes -- an observation of a small sample (A) is used as the basis for a prediction about the population as a whole (B). Also, remember that "prediction" commonly refers to an assertion about the future (B) based on present observations (A).

http://dictionary.reference.com/browse/predict

"(Verb) to foretell the future; make a prediction."

> Yes it did, correlation means you can use the statistics to predict it, even if it wasn't the cause.

Nonsense. Marijuana use doesn't predict an IQ drop, the study doesn't support that prediction, as the authors were careful to point out, and as the journalists were at pains to ignore.

The marijuana use, and the IQ drop, are only correlated -- one does not predict the other.

>> A prediction is therefore the use of observations of A to predict B, to show a cause-effect relationship. I see the disappearance of the middle class (A), and on that basis I predict the fall of civilization (B). I see gathering clouds (A), and on that basis I predict rain (B) -- and puddles (C).

You are using the temporal sense of the word "predict", not the cross-sectional sense. Just because you can use the word predict when there is a cause-effect relationship doesn't mean you can't if there isn't. Here is an example illustrating this: I see lots of graffiti in the town (A) and on that basis, predict this town has a high crime rate (B). Notice this prediction was made independent of time.

>> A prediction forges a link between an observation (A) and an outcome (B). It assumes a cause-effect relationship, one that may not be real, but a word isn't responsible for how people misuse it.

A prediction forges a link between an observation (A) and an outcome (B) to explain a correlation relationship, which may or may not be because of a cause-effect relationship.

>> Yes -- an observation of a small sample (A) is used as the basis for a prediction about the population as a whole (B). Also, remember that "prediction" commonly refers to an assertion about the future (B) based on present observations (A).

Let's say I am a statistician and, after surveying 10% of the population, I find that lower income is correlated with lower IQ. I use this observation as the basis for a prediction about the population as a whole: that lower income predicts lower IQ. Notice again, time is irrelevant.

>> Also, remember that "prediction" commonly refers to an assertion about the future (B) based on present observations (A).

A word may have more than one sense. I am talking about the word prediction as used in statistics.

>> The marijuana use, and the IQ drop, are only correlated -- one does not predict the other.

If they are correlated then you can use the evidence in the sample to make general predictions in the population (independent of time), provided your experiment methodology was valid.

> You are using the temporal sense of the word "predict" ...

Yes -- that's because that's how the word is defined.

http://en.wikipedia.org/wiki/Prediction

"A prediction (Latin præ-, "before," and dicere, "to say") or forecast is a statement about the way things will happen in the future, often but not always based on experience or knowledge." (Emphasis added.)

> A word may have more than one sense. I am talking about the word prediction as used in statistics.

Yes, all right. Statistics uses the word in the same way, for the same purpose -- as a description of a forecasting method, a statement about the future based on past and present data. Consider the various regression-based prediction methods that are, by definition, statements about the future, based on the past.

http://www.isixsigma.com/tools-templates/risk-management/use...

"Forecasting is a business and communicative process and not merely a statistical tool. Basic forecasting methods serve to predict future events and conditions and should be key decision-making elements for management in service organizations."

Shall I list ten more references that make the same point about statistical prediction? How about just one:

http://en.wikipedia.org/wiki/Regression_analysis

"Regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning."

> If they are correlated then you can use the evidence in the sample to make general predictions in the population ...

Only if you don't understand science. Correlation is not causation.

Prediction and forecasting are two different things in regression.

Regression is for predicting a dependent variable based on one or more independent variables. It may or may not involve a time component.
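As a sketch of regression without a time component, the following fits an ordinary least-squares line to a small sample and uses it to predict the dependent variable for an individual outside the sample. The numbers and the names `fit_line`, `sample_x`, and `sample_y` are invented for illustration; nothing here is ordered in time.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical cross-sectional sample: each pair is one individual
# measured once (made-up values, no time ordering).
sample_x = [1, 2, 3, 4, 5]
sample_y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(sample_x, sample_y)

# Predict y for an individual who was not in the sample.
new_x = 6
predicted_y = slope * new_x + intercept
print(slope, intercept, predicted_y)
```

The same fitting code would serve for forecasting if `sample_x` happened to be a time index; the method itself does not care which.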

"Regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning."

Exactly. Prediction and forecasting are not the same thing. That's why they had to state both.

I know your bio is impressive, but I've studied statistics for two years, specifically time-series forecasting and regression modelling. Predictions are often not made in a temporal context, e.g. when you predict aspects of the population based on your observations in a sample.

>> Only if you don't understand science. Correlation is not causation.

I might not understand science as well as you do, but I have some modicum of ability in statistics, so there's no need to wave "Correlation is not causation." at me in every reply you make.

Predictions are not always made as a result of causation.