Statistical summary of IP enrichment strength. Each panel shows a summary statement of IP strength produced by CHANCE for a different sample. For each sample, CHANCE summary statements include: the statistical significance of IP enrichment, or the lack thereof; pie charts estimating the percentage of reads corresponding to DNA fragments pulled down by the antibody and the percentage of the genome enriched for biological signal; and, warning statements of possible bias or lack of sequencing depth. (a) The analysis results for H3K4me3 in human embryonic stem cells (HESCs; GEO GSM727572). Although this sample shows significant enrichment for signal, it also displays a possible lack of sufficient sequencing depth, which will result in a high false negative rate in peak detection. (b) The analysis results for H3K4me3 in mouse neural stem cells (NSCs). This sample shows decent enrichment, but CHANCE also detects an amplification bias in the input channel and alerts the user. (c) The results for the same sample as in (b) after bioinformatic de-duplication of reads. De-duplication has suppressed the amplification bias, recovering biological signal in the IP. The warning message has disappeared after de-duplication. (d) The summary statement for CARM1 in HESCs (GEO GSM801064). For this sample, the IP appeared extremely weak; CHANCE is unable to produce pie chart enrichment estimates as in the previous samples, but it nevertheless reports the false discovery rate (FDR) associated with the test for enrichment. There are four false discovery rates reported, each estimated on a separate subset of training data. Their abbreviations are as follows, HC: histone mark - cancer tissue; HN: histone mark - normal tissue; TC: transcription factor binding site - cancer tissue; TN: transcription factor binding site - normal tissue.
Diaz et al. Genome Biology 2012 13:R98 doi:10.1186/gb-2012-13-10-r98