Fallacies of the Correlation Coefficient

The correlation coefficient is easily misinterpreted, and it is the source of many fallacies.

When one quantity is plotted against another, the independent quantity is conventionally put on the x axis and the dependent quantity on the y axis. This convention causes confusion when neither quantity can be identified as dependent or independent. A high correlation coefficient between two quantities does not imply that one causes the other. Good students tend to score high marks across many subjects, so high scores in maths tend to be correlated with high scores in English. This does not mean that a high score in maths causes a high score in English, or vice versa.

A high correlation coefficient does not imply that the relationship is linear, and a low correlation coefficient does not imply that no relationship exists. Points that follow a clearly curved line can still return a high coefficient, while points that lie on a perfect circle return a coefficient close to zero.
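Both halves of this claim can be checked numerically. The sketch below is illustrative only: the helper name pearson and the two data sets (100 evenly spaced points on a unit circle, and y = x^2 for x from 1 to 10) are assumptions chosen for the example, not data from any study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Points evenly spaced on a unit circle: a perfect relationship, but not a linear one.
angles = [2 * math.pi * k / 100 for k in range(100)]
circle_x = [math.cos(a) for a in angles]
circle_y = [math.sin(a) for a in angles]
r_circle = pearson(circle_x, circle_y)

# Points on the curve y = x^2: clearly not a straight line, yet strongly correlated.
curve_x = list(range(1, 11))
curve_y = [x ** 2 for x in curve_x]
r_curve = pearson(curve_x, curve_y)

print(f"circle: r = {r_circle:.3f}")
print(f"curve:  r = {r_curve:.3f}")
```

The circle returns a coefficient that is zero to within rounding error, and the curve returns a coefficient above 0.95, even though neither set of points lies on a straight line.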

Two quantities may not be directly related, but both may be related to some third variable, and as a result may appear related to each other. Life expectancy has tended to increase with time in western countries, and so have wealth and the incidence of diabetes. There is a well-understood positive correlation between average wealth and life expectancy: rich countries can afford better health systems, better education systems and better quality food, all of which promote long lives. There is also a correlation between average wealth and diabetes, since western diets, high in fat and sugar, tend to lead to diabetes. This does not mean that to live long you should develop diabetes.
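The effect of a third variable can be sketched with simulated data. In the example below, everything is an assumption made for illustration: z stands in for the hidden third variable, a and b are two quantities that each depend on z plus independent noise, and the helper names pearson and partial_corr are hypothetical. The standard first-order partial correlation formula is used to remove the influence of z.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(r_ab, r_az, r_bz):
    """Correlation of a and b after the linear effect of z is removed."""
    return (r_ab - r_az * r_bz) / math.sqrt((1 - r_az ** 2) * (1 - r_bz ** 2))

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Simulated data: a and b never influence each other, only z influences both.
z = [rng.gauss(0, 1) for _ in range(n)]
a = [zi + rng.gauss(0, 1) for zi in z]
b = [zi + rng.gauss(0, 1) for zi in z]

r_ab = pearson(a, b)
r_ab_given_z = partial_corr(r_ab, pearson(a, z), pearson(b, z))

print(f"r(a, b)           = {r_ab:.3f}")
print(f"r(a, b) given z   = {r_ab_given_z:.3f}")
```

The raw coefficient between a and b is clearly positive, but once z is accounted for the coefficient falls close to zero, which is what we would expect when the two quantities are linked only through the third variable.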

Spurious correlations may also exist. The number of deaths per day tends to increase if the population is increasing, so that time and the daily death toll are positively correlated. This does not mean that people are becoming unhealthier, or that some action needs to be taken, only that if there are more people, more people will die each day.
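A small made-up example shows how this arises. The figures below (a starting population of one million, 1% growth per year, and a fixed daily death rate of 3 per 100,000 people) are hypothetical values chosen only to illustrate the point.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

DAILY_DEATH_RATE = 0.00003  # hypothetical: 3 deaths per 100,000 people per day

years = list(range(50))
population = [1_000_000 * 1.01 ** t for t in years]        # hypothetical 1% annual growth
daily_deaths = [DAILY_DEATH_RATE * p for p in population]  # death toll scales with population

r_time_deaths = pearson(years, daily_deaths)

# Deaths per person stay constant, so nobody is getting unhealthier.
per_capita = [d / p for d, p in zip(daily_deaths, population)]
rate_spread = max(per_capita) - min(per_capita)

print(f"r(time, daily deaths) = {r_time_deaths:.3f}")
print(f"spread in per-capita death rate = {rate_spread:.2e}")
```

Time and the daily death toll are almost perfectly correlated here, yet the death rate per person does not change at all. Dividing by the population is the step that exposes the correlation as an effect of population growth alone.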

The correlation coefficient only tests a relationship between two variables. In practice, it is often impossible to isolate two factors from all others. If we wanted to test whether wealth and life expectancy were correlated, we would find it hard to separate these two factors from the influence of pollution, education, sex and war.
