Gene prediction accuracy for each ENCODE sequence at the nucleotide and exon levels. Boxplots showing the average sensitivity and specificity at the (a) nucleotide level and (b) exon level for CDS evaluation of each program on every sequence of the test set. Sequences are displayed along the x-axes. Manual picks are shown in light blue; random picks are shown in orange. Boxplots corresponding to the overall average sensitivity and specificity at the nucleotide level for CDS evaluation in different subsets of the ENCODE sequences are shown at the right of the graph. EN_TRN13, the set of 13 training regions, and EN_PRD31, the set of 31 test regions, are shown in green. EN_MNLp12, the 12 manual picks in the test set, and EN_RNDp19, the 19 random picks in the test set, are shown in dark blue. EN_PGH12/EN_PGM11/EN_PGL8, the subsets of 12 high, 11 medium and 8 low gene-density sequences from the set of test sequences, are shown in yellow. EN_PMH7/EN_PMM5/EN_PML7, the subsets of seven regions with high sequence conservation with mouse, five regions with medium conservation, and seven regions with low conservation from the random picks in the test set, are shown in red.
Guigó et al. Genome Biology 2006 7(Suppl 1):S2 doi:10.1186/gb-2006-7-s1-s2