A confounding variable is a variable which in some way biases or distorts the result of a statistical test, comparison, or model. Commonly this occurs when a relationship is found between two variables which is in fact due (in whole or in part) to the effect of a third, unmeasured or excluded variable. As a simple example, suppose one found a positive relationship between ice cream sales and deaths by drowning. One might then conclude that in some way ice cream consumption contributed to unsafe swimming practices. In fact, however, it is likely that any such association would be spurious, the result of the confounding variable of temperature. On hot days, more people buy ice creams, and also more people go swimming and hence are at risk of drowning. In order to test the original hypothesis that drowning and ice cream deaths are causally connected, therefore, one would need to account for (or ‘control’) the confounding factor of temperature (perhaps by comparing ice cream sales and drowning on days of similar temperature).
Confounding variables are exceptionally common in medical or health related studies, as well as in any political or economic data. Some illustrative examples include:
- Associations between consumption of a certain food product and better or worse health outcomes. It is quite likely that people who consume different foods also have different degrees of access to health care, different exercise regimes, different smoking and drinking habits, are of different ages, etc. All of these may act as confounding variables in any association of diet and health outcomes.
- Findings that certain countries or regions of a country have better or worse social outcomes than others. Different countries inevitably differ in numerous ways, including the climate, population demographics, cultural norms, income levels, degree of political stability, religion, education levels, workforce composition, and many more. Any of these factors could contribute to differences in health, educational, happiness, environmental, and other factors that tend to be compared. As such it is quite difficult to make claims about whether one countries policies in some area are actually better than those of another country, given the large number of potential confounding variables.
- Findings that people of a particular race, religion, ethnicity, or other identifiable group have better or worse health, employment, happiness, or other such outcomes compared to another group. Such differences are often the result not of race or religion etc directly, but rather are caused by confounding factors such as age differences in different communities, generational differences, differences in income levels, etc.
Confounding variables can be dealt with either by explicitly including them in a statistical analysis, or by conducting a randomised trial.
What is a confounding variable?: a simple explanation from a psychology blog
Confounding variable/third variable: short discussion of the problems posed by confounding variables and some ways to avoid them
Extraneous and confounding variables and systematic vs non-systematic error: short section on confounding variables and their relation to systematic errors