This article is part of a special issue on epigenomics.
methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles
1 Department of Physiology and Biophysics, 1305 York Ave., Weill Cornell Medical College, New York, NY 10065, USA
2 The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, 1305 York Ave., Weill Cornell Medical College, New York, NY 10065, USA
3 Department of Public Health, Weill Cornell Medical College, 1300 York Ave., New York, NY 10065, USA
4 Department of Medicine, Division of Hematology/Oncology, 1300 York Ave., Weill Cornell Medical College, New York, NY 10065, USA
5 Department of Pathology, University of Michigan, 109 Zina Pitcher Place, Ann Arbor, MI 48109, USA
6 Department of Pharmacology, 1300 York Ave., Weill Cornell Medical College, New York, NY 10065, USA
Genome Biology 2012, 13:R87. doi:10.1186/gb-2012-13-10-r87. Published: 3 October 2012
DNA methylation is a chemical modification of cytosine bases that is pivotal for gene regulation, cellular specification and cancer development. Here, we describe an R package, methylKit, that rapidly analyzes genome-wide cytosine epigenetic profiles from high-throughput methylation and hydroxymethylation sequencing experiments. methylKit includes functions for clustering, sample quality visualization, differential methylation analysis and annotation, thus automating and simplifying many of the steps for identifying bases or regions with statistically significant differences in DNA methylation. Finally, we demonstrate methylKit on breast cancer data, in which we find statistically significant regions of differential methylation and stratify tumor subtypes. methylKit is available at http://code.google.com/p/methylkit.
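To make the idea of a "statistically significant" differentially methylated base concrete, the following is a minimal Python sketch, not methylKit's R interface. It assumes that each cytosine is summarized by counts of reads supporting methylation (C) and non-methylation (T), and that two samples are compared with a two-sided Fisher's exact test, one common choice for such count tables. The function names and the read counts in the usage lines are hypothetical and chosen only for illustration.

```python
from math import comb


def percent_methylation(num_c, num_t):
    """Percent of reads supporting methylation at a single cytosine."""
    return 100.0 * num_c / (num_c + num_t)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two samples; columns are methylated (C) and
    unmethylated (T) read counts at one base.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x methylated reads in row 1
        # with all table margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of every table that is no more likely than the
    # observed one; the small tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical read counts at one CpG: (methylated, unmethylated).
control = (18, 2)
tumor = (5, 15)

meth_diff = percent_methylation(*tumor) - percent_methylation(*control)
p_value = fisher_exact_two_sided(*control, *tumor)
print(f"methylation difference: {meth_diff:.1f}%, p-value: {p_value:.3g}")
```

In a genome-wide setting the same test would be repeated at every covered base, so the resulting p-values would need correction for multiple testing before any base is called significant.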