Monday, December 3, 2012

Logical Fallacies: Correlation/Causation


Understanding this fallacy depends on understanding the concept of correlation, so first I have to take some time to explain that.  Variable A and variable B are two quantities or characteristics that can vary.  If changes in one coincide with changes in the other a high percentage of the time, the two are correlated.  If increases in one coincide with increases in the other, and decreases with decreases, this is a positive correlation.  If, on the other hand, increases in one coincide with decreases in the other, this is a negative correlation.
            On the Earth’s surface, the further one is from the equator, the lower the temperature.  Therefore, distance from the equator and temperature are negatively correlated.  The higher the education level of the people surveyed, given a proper sample, the smaller the religious percentage of that group of people.  Therefore, education level and religiosity are also negatively correlated.
            Education level and unemployment rate are also negatively correlated, while education level and median income are positively correlated.
            With any given planet or moon, the more mass it has (for a given size), the heavier a given object will be on its surface.  Thus, mass and surface gravity are positively correlated.  Planets and moons with more mass and thus stronger gravity tend to be more spherical, because their stronger gravity limits just how far geological structures can protrude before being pulled back in by the force of their own weight.  Mars, having weaker gravity than the Earth, has structures protruding further.  Its mountains are significantly higher and its valleys are significantly deeper.  So on the surface of a planet (or at least a rocky planet), gravity and maximum mountain height are negatively correlated.
            The correlation is positive, though, when it comes to gravity and atmosphere thickness.  Mars, having much weaker gravity than the Earth, has a thinner atmosphere.  In fact, its atmosphere is so much thinner and its mountains extend so much higher that the peak of its highest mountain, Olympus Mons, rises above most of its atmosphere, much the way an island rises above the surface of a body of water.  Indeed, I often wonder if that might be a good spot to land a rover.  But I digress.
            Positive correlations:
            The higher A, the higher B.  The lower A, the lower B.
            Where A is, B usually is as well, and where A is not, B usually is not either.  Presence of A tends to coincide with presence of B and absence of A tends to coincide with absence of B.
            Negative correlations:
            The higher A, the lower B.  The lower A, the higher B.
            Where A is, B usually is not, and vice versa.
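To put a number on "positive" and "negative", here is a minimal sketch that computes the Pearson correlation coefficient, one common measure of correlation (the post itself does not name a specific measure, so that choice is an assumption).  The data values are made up purely for illustration and are not real measurements.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented illustrative numbers, NOT real data.
latitude = [0, 15, 30, 45, 60, 75]        # degrees from the equator
temperature = [27, 25, 20, 12, 3, -10]    # average temperature, deg C
years_school = [8, 10, 12, 14, 16, 18]    # years of education
income = [20, 24, 31, 38, 50, 61]         # median income, thousands

r_negative = pearson_r(latitude, temperature)
r_positive = pearson_r(years_school, income)
print(f"latitude vs temperature: r = {r_negative:.2f}")   # close to -1
print(f"schooling vs income:     r = {r_positive:.2f}")   # close to +1
```

A value near +1 means "the higher A, the higher B", a value near -1 means "the higher A, the lower B", and a value near 0 means no linear relationship.  Note that the number says nothing about which variable, if either, is doing the causing.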
            But observe that, so far, I have yet to utter a single word about causation, and there’s a reason for that.  You see, when a correlation has been identified between A and B, there are three possible explanations for it.  It could be that the changes in A are causing the changes in B, it could be that the changes in B are causing the changes in A, or it could be that the changes in both are being caused by something else entirely: an extraneous variable, that is, a variable that the study in question just did not happen to account for.
            Statistically speaking, one in fifteen women is going to attempt suicide at one time or another.  If, on the other hand, we narrow the sample specifically to women who have gotten breast implants, it becomes one in five.  Women with breast implants are three times as likely to attempt suicide.  That is a positive correlation.
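The "three times" figure is just the ratio of the two rates quoted above.  As a quick check of the arithmetic (taking the one-in-fifteen and one-in-five figures as given, without verifying them):

```python
from fractions import Fraction

baseline_risk = Fraction(1, 15)  # rate quoted above for women in general
implant_risk = Fraction(1, 5)    # rate quoted above for women with implants

# Relative risk: how many times more likely the outcome is in one group.
relative_risk = implant_risk / baseline_risk
print(relative_risk)  # 3
```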
            One way to interpret this is to suggest that the change in variable A causes the change in variable B; that somehow, the act of getting the implants makes these women more likely to attempt suicide.  Personally, I find it more plausible that both changes are being caused by an extraneous variable, and the factor I suspect is low self-esteem.  I find it much more likely that low self-esteem compels women both to get implants and to attempt suicide.  The woman comes under the impression that this particular surgical augmentation is going to make her feel better about herself, and for a little while, it does.  Then her spirits drop right back to where they were before.  This drastic act did not have lasting benefits because it focused on the symptom, not the underlying problem, but again, I digress.
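To see how an extraneous variable can manufacture a correlation all by itself, here is a small simulation sketch.  Everything in it is hypothetical: the hidden factor Z, the probabilities, and the function names are invented for illustration, and it is not a model of any real study.  In this toy world, A and B each depend only on Z and have no effect on each other.

```python
import random

def simulate(n=20000, seed=0):
    """Generate (z, a, b) rows where Z drives both A and B independently."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.3                    # hidden factor present?
        a = rng.random() < (0.6 if z else 0.1)    # A depends only on Z
        b = rng.random() < (0.5 if z else 0.05)   # B depends only on Z
        rows.append((z, a, b))
    return rows

def rate_of_b(rows, keep):
    """Fraction of rows with B present among rows selected by keep(z, a)."""
    selected = [b for z, a, b in rows if keep(z, a)]
    return sum(selected) / len(selected)

rows = simulate()

# Ignoring Z: B is far more common where A is present.
b_given_a = rate_of_b(rows, lambda z, a: a)
b_given_not_a = rate_of_b(rows, lambda z, a: not a)

# Holding Z fixed (only rows where Z is present): the gap disappears.
b_given_a_with_z = rate_of_b(rows, lambda z, a: z and a)
b_given_not_a_with_z = rate_of_b(rows, lambda z, a: z and not a)

print(f"B rate with A: {b_given_a:.3f}, without A: {b_given_not_a:.3f}")
print(f"Among Z only, with A: {b_given_a_with_z:.3f}, "
      f"without A: {b_given_not_a_with_z:.3f}")
```

With these made-up probabilities, B shows up roughly three times as often alongside A as without it, even though A does nothing to B.  Once the comparison is restricted to cases that share the same value of Z, the difference all but vanishes.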
            Now this is a problem I had with Thunderf00t shortly before I unsubbed from him.  In one video, I don’t remember which, he pointed out (assuming that this is true) that a disproportionately small percentage of the world’s scientific breakthroughs come from the parts of the world in which Islam is the predominant religion.  He tried to use this to argue that there is something about Islam that is intrinsically more antithetical to scientific progress than other forms of dogmatism.  I don’t remember who, but someone called him on that in the comments, explaining that correlation does not establish causation.
            Tf00t responded by pointing out how absurd it is to suggest that the higher rate of cancer among smokers is not caused by tobacco use.
            I then called him on the strawman.  What this guy said is not that correlation negates causation, but that it fails to establish it.  Do we know about the causal connection between tobacco use and cancer?  Of course.  Do we know about it just from the correlation?  No.
            Remember, when a correlation has been observed between variable A and variable B, there are three possible explanations: the changes in A could be causing the changes in B, the changes in B could be causing the changes in A, or the changes in both could be caused by something else entirely, an extraneous variable.
            Now given the previous comprehension of science Tf00t had exhibited, and given the fact that, in any field of science, this much is covered in the first semester, he does not really have any excuse for not understanding this.
            Consider that most of the people in the world for whom rice is a staple food have dark hair.  Does this mean that looking in the mirror and seeing dark hair somehow makes one more likely to want rice?  Does this mean that there is something in rice that turns one’s hair dark?  I suppose these are both possibilities, but I, for one, don’t find either especially likely.  No, much more likely, it’s because people for whom rice is a staple food are usually from Asia, where dark hair is the norm.  The extraneous variable is where one happens to be from.
            When tobacco companies were confronted with this correlation between tobacco use and the occurrence of cancer, they insisted that it was probably just because the sort of person who was already more likely to develop cancer was also more likely to take up tobacco use.  In other words, they insisted on an extraneous variable.
            So what is the scientific response to a correlation?  To test these different possibilities.  If one sets up a controlled experiment that enables one to manipulate A and observe B, and the changes one makes to A coincide with the very changes in B that the correlation suggests, then this demonstrates that A causes B.  If they don’t, then this possible explanation is ruled out.  One then needs to set up an experiment that enables one to manipulate B and observe A.  If this produces the predicted changes, then this explanation is proven, but otherwise, this possible explanation is also ruled out, and the only explanation remaining is the extraneous variable.
            In response to this claim by the tobacco companies, the scientific community took a collection of mice and sorted them randomly into two groups: an experimental group and a control group.  Let me emphasize that.  They had to be very careful to sort them randomly, so that whatever made a mouse more or less prone to cancer would be spread evenly across both groups.  Then they kept the two groups of mice living under environmental conditions that were as close to identical as they could get them, to make sure that any subsequent changes could not be attributed to any other environmental differences.  Then they gave the mice in the experimental group regular exposure to tobacco but not those in the control group, and monitored both.  The tobacco exposure had to be the only significant difference between the two environments.
            I don’t remember whether it was just a few months, or a few years, but over this time, only the mice in the experimental group (the ones who had been exposed to tobacco) developed cancer.  This was the proverbial smoking gun (no pun intended).  Here was the irrefutable proof.  Confronted with this, the tobacco companies could only respond by doing everything in their power to persuade the general public to either overlook these test results, or deliberately ignore them.
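The logic of that kind of experiment can be sketched as a simulation.  This is a toy model under stated assumptions, not a reconstruction of the actual mouse studies: the hidden trait, the probabilities, and the names `run_trial` and `exposure_effect` are all invented for illustration.  Each subject may carry a hidden trait that raises its risk, and random assignment spreads that trait evenly across both groups so it cannot explain a difference between them.

```python
import random

def run_trial(exposure_effect, n=20000, seed=1):
    """Randomly split subjects into two groups and return outcome rates.

    exposure_effect is the hypothetical amount by which exposure raises
    the probability of the outcome.  A hidden trait also raises it.
    Returns (rate in experimental group, rate in control group).
    """
    rng = random.Random(seed)
    # True means the subject carries the hidden risk-raising trait.
    subjects = [rng.random() < 0.3 for _ in range(n)]
    rng.shuffle(subjects)                      # random assignment
    half = n // 2
    experimental, control = subjects[:half], subjects[half:]

    def outcome_rate(group, exposed):
        hits = 0
        for has_trait in group:
            p = 0.05                           # baseline risk
            if has_trait:
                p += 0.30                      # hidden trait raises risk
            if exposed:
                p += exposure_effect           # causal effect, if any
            hits += rng.random() < p
        return hits / len(group)

    return outcome_rate(experimental, True), outcome_rate(control, False)

# World 1: exposure really does cause the outcome.
exposed_rate, control_rate = run_trial(exposure_effect=0.25)
# World 2: exposure does nothing; only the hidden trait matters.
null_exposed, null_control = run_trial(exposure_effect=0.0)

print(f"causal world:    exposed {exposed_rate:.3f}, control {control_rate:.3f}")
print(f"no-effect world: exposed {null_exposed:.3f}, control {null_control:.3f}")
```

In the world where exposure has a real effect, the experimental group shows a clearly higher rate.  In the world where only the hidden trait matters, the two groups come out about the same, because randomization gave each group its fair share of the trait.  That is why a difference between randomized groups points to the exposure itself rather than to an extraneous variable.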
            Correlation is only one ingredient in establishing causation.  True, it is an essential ingredient, but far from the only one.  To get from correlation to causation, testing at least two of the three possibilities is necessary.  Ideally, this is done with a controlled experiment, but sometimes, such is prevented by practical or ethical limitations.  In this case, all one can do is ask oneself, in each case, “What evidence should exist if this is true?  What evidence should exist if it isn’t?” or more concisely, “How is it verifiable if true?  How is it falsifiable if false?” and then look for evidence from both lists.
            Now be careful here.  You have to have answers for both (and the more, the better) before you begin your investigation, or you are in danger of confirmation bias and nonfalsifiability, both of which are fallacies I explained earlier in the playlist.
