Would we trust an insurance provider who sets motorbike insurance rates based on the sales of sour cream? Or would we schedule our space launches according to the number of doctoral degrees awarded in Sociology?
Probably all of us would agree that this kind of decision-making is unjustified. A specific decision like this appears to be only superficially supported by the evidence of correlations between those various factors, but is there more to the story? Does it go any deeper? What if there exists a hidden causal factor that induces the apparently spurious correlation?
For example, suppose the increase in space launches and the increase in doctoral degrees in Sociology were both related to an increase in government investments in research studies on the sociological impacts of establishing a permanent human colony on the Moon. This case reveals a hidden causal connection in an otherwise strange correlation. The explanatory variable (which is a hidden confounding factor) is the research investment, and the response variables are the space launches and doctoral degrees.
What about other cases? What about the evidence that sour cream sales correlate with motorbike accidents? In such cases, shouldn’t we all be pleased to see organizations making evidence-based data-driven objective decisions (especially in this brave new world of exploding data volumes and ubiquitous analytics)? No, I don’t think so!!
So, what kind of world is this?
Welcome to the world of explanatory variables and confounding factors!
Statistical literacy is needed now more than ever (to paraphrase H. G. Wells). This includes awareness of and adherence to common principles of statistical reasoning. For example…
Follow Kirk Borne on Twitter @KirkDBorne