Accounting for cellular heterogeneity is critical in epigenome-wide association studies
1 Lieber Institute for Brain Development, Johns Hopkins Medical Campus and Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
2 Biostatistics and Computational Biology, Dana Farber Cancer Institute and Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
Genome Biology 2014, 15:R31 doi:10.1186/gb-2014-15-2-r31
Published: 4 February 2014
Epigenome-wide association studies of human disease and other quantitative traits are becoming increasingly common. A series of papers reporting age-related changes in DNA methylation profiles in peripheral blood has already been published. However, blood is a heterogeneous collection of different cell types, each with a distinct DNA methylation profile.
Using a statistical method that estimates the relative proportions of cell types from DNA methylation profiles, we examine data from five previously published studies and find strong evidence that cell composition in blood changes with age. We also demonstrate that, in these studies, cellular composition explains much of the observed variability in DNA methylation. Furthermore, we find high levels of confounding between age-related variability and cellular composition at the CpG level.
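To make the idea of estimating cell proportions from methylation profiles concrete, the following is a minimal sketch, not the method used in the studies above. It assumes a reference-based approach: the methylation value of a mixed sample at each CpG is modeled as a weighted average of cell-type-specific reference values, and the weights are recovered by non-negative least squares. The function name, the three cell-type labels, and all numeric values are hypothetical and chosen only for illustration.

```python
# Sketch: recover cell-type proportions from a mixed methylation profile
# by non-negative least squares, solved with projected gradient descent.
# Assumption: a reference-based model; all numbers below are made up.

def estimate_proportions(reference, mixed, steps=2000):
    """reference: one row per CpG, one column per cell type (beta values).
    mixed: observed beta value of the mixed sample at each CpG.
    Returns non-negative weights rescaled to sum to 1."""
    n_types = len(reference[0])
    # Step size 1 / trace(R^T R) never exceeds 1 / largest eigenvalue,
    # which keeps the gradient iteration stable.
    step = 1.0 / sum(v * v for row in reference for v in row)
    w = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Residual at each CpG: predicted mixture minus observed value.
        resid = [sum(r * wj for r, wj in zip(row, w)) - y
                 for row, y in zip(reference, mixed)]
        # Gradient of 0.5 * ||R w - y||^2 with respect to w.
        grad = [sum(row[j] * e for row, e in zip(reference, resid))
                for j in range(n_types)]
        # Gradient step followed by projection onto w >= 0.
        w = [max(0.0, wj - step * g) for wj, g in zip(w, grad)]
    total = sum(w)
    return [wj / total for wj in w]


# Hypothetical reference profiles: 6 CpGs x 3 cell types
# (columns: granulocytes, lymphocytes, monocytes).
reference = [
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.1, 0.9],
    [0.8, 0.2, 0.5],
    [0.2, 0.7, 0.3],
    [0.5, 0.5, 0.1],
]
true_props = [0.6, 0.3, 0.1]
# Simulate a noise-free whole-blood sample as a weighted average.
mixed = [sum(r * p for r, p in zip(row, true_props)) for row in reference]

estimated = estimate_proportions(reference, mixed)
print([round(p, 3) for p in estimated])
```

In this noise-free toy example the estimated weights match the simulated proportions. With real data the reference CpGs would need to be chosen to discriminate between cell types, and measurement noise would make the estimates approximate.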
Our findings underscore the importance of considering cell composition variability in epigenetic studies based on whole blood and other heterogeneous tissue sources. We also provide software for estimating and exploring this composition confounding for the Illumina 450k microarray.