TrxG/PcG binding profiles are predictive of gene regulatory states. A sparse Gaussian mixture model (sGMM) emitting class assignments (1, 2, and 3) was fitted to ChIP-seq enrichment data for ASH1, FSH, TRX-C, PC, PH, and PSC. Normalized enrichments relative to Input were calculated for the 1-kb window centered on each known TSS. (A) Pairs plot of the fitted model in selected dimensions; each plotting symbol represents a TSS window (class 1 = circle, class 2 = triangle, class 3 = plus sign); the color of each symbol indicates the expression level of the corresponding gene (see color key for details). Numbered ellipses denote the mean and variance of each class in the plotted dimensions. (B) Class 1 genes are characterized by the absence of TrxG/PcG proteins at the promoter. Class 2 and class 3 genes are cobound by ASH1 and FSH; in addition, class 3 genes display strong PRC1 and TRX-C enrichments. Heat map of class means (columns) across all model dimensions. Row clustering indicates proteins showing correlated changes between classes. (C) Class 1 and class 3 genes are mostly inactive, whereas class 2 genes are transcribed. Box plots display the distribution of gene expression within each sGMM class. Graphs show density estimates for gene expression and H3K4me3 modification (1-kb window) within each sGMM class compared with all genes (total). (D) Class 2 signal density and mRNA signal density show strong spatial correlation across chromosomes. Plot shows kernel density estimates for class 2 predictions and discretized mRNA signal (log2 FPKM cut-off = 2.5) in 1-kb windows tiling the chromosomes.
Kockmann et al. Genome Biology 2013 14:R18 doi:10.1186/gb-2013-14-2-r18
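The caption does not spell out how the normalized enrichments relative to Input were computed. As an illustration only, the following minimal sketch shows one common way to score a 1-kb TSS window: a log2 ratio of library-size-scaled ChIP read counts over Input read counts. The function names, the pseudocount, the library-size scaling, and the representation of reads as sorted midpoint positions are all assumptions made for this example and are not taken from the original analysis.

```python
import math
from bisect import bisect_left


def tss_window(tss, width=1000):
    """Return the half-open [start, end) window of `width` bp centered on a TSS."""
    start = max(0, tss - width // 2)
    return start, start + width


def count_reads(sorted_positions, start, end):
    """Count read positions in [start, end); the list must be sorted ascending."""
    return bisect_left(sorted_positions, end) - bisect_left(sorted_positions, start)


def log2_enrichment(chip, inp, tss, width=1000, pseudocount=1.0):
    """Hypothetical enrichment score: log2 of scaled ChIP over Input counts.

    `chip` and `inp` are sorted read positions on one chromosome. The ChIP
    count is scaled by the ratio of library sizes (an assumed normalization),
    and a pseudocount keeps the ratio finite for empty windows.
    """
    if not chip or not inp:
        raise ValueError("both libraries must contain at least one read")
    start, end = tss_window(tss, width)
    chip_n = count_reads(chip, start, end)
    inp_n = count_reads(inp, start, end)
    scale = len(inp) / len(chip)  # equalize sequencing depth
    return math.log2((chip_n * scale + pseudocount) / (inp_n + pseudocount))


# Toy data: seven of eight ChIP reads fall near the TSS at position 5000,
# compared with two of eight Input reads.
chip_reads = [4600, 4700, 4800, 4900, 5000, 5100, 5200, 9000]
input_reads = [100, 4800, 5200, 7000, 8000, 9000, 9500, 9900]
score = log2_enrichment(chip_reads, input_reads, tss=5000)
```

Under these assumptions, one such score per factor (ASH1, FSH, TRX-C, PC, PH, PSC) would give each TSS window a six-dimensional vector, which is the kind of input a mixture model could then be fitted to.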