Tuesday, October 7, 2014

Spurious Correlations and Dr. Alice Stewart's Triumph

Shoe fitting by x-ray (fluoroscope)

Who says statistics are dry and boring? Tyler Vigen, a Harvard Law student, recently created a fun website called Spurious Correlations. Tyler pokes fun by finding sets of numbers which, by random coincidence, are related. He notes, for instance, that the total number of US Political Action Committees is strongly related to the number of people who die by falling out of a wheelchair. Or that the number of people who have died from being tangled in their bedsheets tracks the total revenue generated by US ski facilities.

Amusing nonsense. And proof positive that correlation does not prove causation. In other words, that two sets of data are correlated does not mean that one causes the other.

The word correlation comes to us from Middle French and means "related together." Two sets of measurements or data that track each other are correlated. Statistically, a relatively simple procedure can be applied to these sets, resulting in a relationship measurement (the correlation coefficient) that ranges from -1 to +1. A coefficient of 0 means that the two sets of data show no straight-line relationship, i.e., none of the changes in one set can be explained by proportional changes in the other. A coefficient of +1 or -1 shows perfect correlation: the sets of data change in lockstep with each other (-1 is a perfect inverse relationship, +1 a perfect direct relationship).

Ice cream cone purchases, when compared to outdoor temperature, form a strong direct relationship: as temperatures increase, purchases of ice cream likewise increase. As men increase in age, the number of hairs on their heads decreases; this is an inverse relationship and, unfortunately, a strongly correlated one.
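For the curious, here is a minimal sketch of that "relatively simple procedure," written in Python. It assumes the coefficient meant here is the usual Pearson coefficient. The function name `pearson` and all of the numbers are invented for illustration; they are not real sales figures or hair counts.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How far each measurement sits from its own average.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Do the two sets stray from their averages together (positive sum),
    # in opposite directions (negative sum), or with no pattern (near zero)?
    together = sum(a * b for a, b in zip(dx, dy))
    spread_x = sum(a * a for a in dx)
    spread_y = sum(b * b for b in dy)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("coefficient is undefined when one list never changes")
    # Dividing by the spreads scales the result into the range -1 to +1.
    return together / sqrt(spread_x * spread_y)

# Hypothetical data, made up for this example.
temperature_f = [55, 62, 70, 78, 85, 93]
cones_sold = [12, 18, 25, 31, 40, 47]
age_years = [20, 30, 40, 50, 60, 70]
hairs_thousands = [100, 96, 90, 81, 70, 58]

r_ice_cream = pearson(temperature_f, cones_sold)
r_hair = pearson(age_years, hairs_thousands)

print(f"temperature vs. cones sold: {r_ice_cream:+.2f}")
print(f"age vs. hairs on head:      {r_hair:+.2f}")
```

With these made-up numbers the first coefficient comes out close to +1 (a direct relationship) and the second close to -1 (an inverse relationship).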

People are quick to tell you that correlation does not prove causation, usually when they disagree with the findings of some study. And they are correct; as Vigen demonstrates, coincidental correlations are not rare.
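To see why coincidental correlations turn up so easily, here is a small Python sketch. It is not Vigen's method, just a simplified stand-in: it generates many pairs of short lists of purely random numbers and counts how many pairs look strongly correlated anyway. The list length, the number of pairs, and the 0.8 cutoff for "strong" are all arbitrary choices made for this example.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    together = sum(a * b for a, b in zip(dx, dy))
    return together / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

random.seed(2014)   # fixed seed so repeated runs give the same count

N_POINTS = 5        # measurements per list (say, five yearly figures)
N_PAIRS = 1000      # how many unrelated pairs we compare
THRESHOLD = 0.8     # arbitrary cutoff for calling a correlation "strong"

def random_series(n):
    """n random numbers between 0 and 1, unrelated to anything else."""
    return [random.random() for _ in range(n)]

strong = 0
for _ in range(N_PAIRS):
    r = pearson(random_series(N_POINTS), random_series(N_POINTS))
    if abs(r) >= THRESHOLD:
        strong += 1

print(f"{strong} of {N_PAIRS} unrelated pairs have a coefficient of "
      f"{THRESHOLD} or stronger")
```

None of these lists has anything to do with any other, yet some pairs clear the bar by chance alone. Search through enough short sets of numbers and a few impressive-looking matches are all but guaranteed.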

But as an informed citizen, you must be aware that correlation is a powerful data analysis tool used in fields ranging from health studies to astrophysics to economics. When two sets of numbers are correlated, statisticians do not jump to the conclusion that one causes the other, but rather are put on alert that deeper analysis is called for. Correlation is not proof, but it directs the investigation.

Take the investigation performed by Dr. Alice Mary Stewart, which ultimately spared millions of children the scourge of childhood cancer.

To set the stage, you must realize that x-rays and radioactivity, discovered in the 1890s by Wilhelm Roentgen and Henri Becquerel and soon explored further by Marie and Pierre Curie, were considered miracles of the modern age. The danger of exposure was not fully appreciated, and radiation was used in the twentieth century for such applications as illuminating watch dials, fitting shoes, and viewing a fetus in the womb. All were considered safe.

But Dr. Stewart was on a quest to explain a rash of lymphatic leukemia and other childhood cancers in England. Her own godchild had died of the disease, which drove her interest.

Dr. Stewart and her team conducted a study of 203 English public health hospitals during 1953-1955, looking for details of all children who had contracted cancer. The study included a questionnaire filled out by each mother.

From this mass of data, Dr. Stewart and her statistician, George Kneale, searched for patterns and looked for correlations. And they found one. It turned out that children whose mothers had been x-rayed during pregnancy were twice as likely to contract leukemia as those whose mothers had not. The finding was stunning, and rejected. The safety of radiation was settled science. And Dr. Stewart was, after all (sniff), only a woman.

But she and Kneale persevered. Their working relationship was interesting. Whenever Stewart proposed a conclusion, Kneale would apply all of his substantial statistical skill to prove her wrong. This was their method, and it made her results stronger as she defended them against stout attacks.

It took nearly twenty years, but Stewart and Kneale amassed a growing dataset of 22,000 childhood cancer victims in which the use of pre-natal x-rays was increasingly implicated. Finally, in the late 1970s, the American and British medical societies accepted Dr. Stewart's findings and recommended that pre-natal x-rays not be routinely performed.

Two lessons here. First, correlation does not prove causation, but it is often a smoking gun begging for deeper analysis. Second, share your conclusions and data, welcome challenges, and defend your findings with logic and reason. Do not dismiss another's theory just because she doesn't share your prejudices.

And now, dear citizen, you know more of correlation than most do.
