Average number of motifs and matches at 15% FDR recovered by the continuous and discrete mutual-information scoring functions. The same optimization and filtering procedure was applied to both versions (seed FDR < 0.001, α = 0.75, γ = 0.75). (Left) Average results for the 24 yeast datasets. The leftmost column shows the results obtained with the continuous version of the scoring function implemented in RED2. The 'k = i' columns show the results obtained with the discrete version using i clusters. The 'best k' column shows the results obtained when the number of clusters that yields the highest number of motifs is selected a posteriori for each dataset. (Right) Number of predicted motifs that match a known motif in the ScerTF database for the 24 yeast datasets (FDR < 15%). Each point corresponds to one of the 24 datasets. The y-axis shows the number achieved by RED2 and the x-axis the number achieved by the discrete version with the best k procedure. Superimposed points are indicated by shading. RED2 found more motifs than the best clustering for 18 datasets and fewer for two datasets, which gives a sign test P value of 0.0004.
Lajoie et al. Genome Biology 2012 13:R109 doi:10.1186/gb-2012-13-11-r109
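The sign test P value quoted in the caption can be checked with a short calculation. The sketch below is not the authors' code. It assumes a two-sided sign test and assumes that the four datasets not counted among the 18 wins or two losses are ties excluded from the test. The function name `sign_test_two_sided` is hypothetical.

```python
from math import comb


def sign_test_two_sided(wins: int, losses: int) -> float:
    """Two-sided sign test P value under the null of equal win/loss probability.

    Assumption: ties are dropped before calling, so n = wins + losses.
    """
    n = wins + losses
    k = min(wins, losses)
    # Probability of observing k or fewer of the rarer outcome under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)


# 18 datasets where RED2 found more motifs, 2 where it found fewer.
p_value = sign_test_two_sided(18, 2)
print(f"{p_value:.4f}")  # prints 0.0004
```

Under these assumptions the exact value is 422/1,048,576, about 0.0004, which agrees with the figure reported in the caption.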