Heterogeneity

Heterogeneity is a problem that arises when a statistical analysis pools data from different sources or samples, for example outcomes from different experiments, data collected in different years, or results obtained at different physical locations. Subsets of the pooled dataset may then have different statistical properties, such as different means, variances, or relationships between variables, which can bias statistical results and lead to incorrect inferences. Care must therefore always be taken when combining data from different sources, and one should be wary of studies that draw comparisons from data obtained in this way.
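
The effect of pooling can be illustrated with a minimal sketch. The two "sites" and all of their numbers are invented for illustration and do not come from any real study. Within each hypothetical site the outcome falls as x rises, yet because the sites have different baselines, the trend in the pooled data points the other way.

```python
from statistics import mean


def slope(xs, ys):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Hypothetical measurements from two sources with different baselines.
site_a_x = [1, 2, 3, 4]
site_a_y = [5, 4, 3, 2]      # y decreases as x increases
site_b_x = [6, 7, 8, 9]
site_b_y = [12, 11, 10, 9]   # same downward trend, higher baseline

# Pooling ignores which source each observation came from.
pooled_x = site_a_x + site_b_x
pooled_y = site_a_y + site_b_y

slope_a = slope(site_a_x, site_a_y)
slope_b = slope(site_b_x, site_b_y)
slope_pooled = slope(pooled_x, pooled_y)

print(slope_a)       # -1.0
print(slope_b)       # -1.0
print(slope_pooled)  # 1.0
```

Each site on its own shows a slope of -1, while the pooled data show a slope of +1. An analysis that ignored the source of each observation would therefore infer the opposite relationship from the one that holds within every subset.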

Further Reading

Data homogeneity: brief explanation with examples from climate science

Homogeneity, homogeneous data & homogeneous sampling: a short but clear introduction

What is heterogeneity and why is it important?: detailed discussion from a British Medical Journal article