Semantic similarity and GO:BP predictions. Series of plots relating the semantic similarity (SS) from tenfold cross-validation to the choice of a threshold for the prediction probability, tp. (a) An example illustrating the SS calculation. The nodes represent GO:BP terms, where the topmost node is the root. The red edges are 'is-a' relationships and the blue, dashed edges are 'part-of' relationships in the ontology. Green nodes represent terms that are known and held out for one gene, while the orange nodes are examples of predicted terms for the same gene. The half-orange, half-green node is an example where the predicted term perfectly matches a held-out term. The light blue nodes are the ancestor terms that fall within the path to the root, but are not annotated to either of the genes in this example. The SS of (a) is measured to be 0.45 through G-SESAME. (b) SS = 0.45 is also the median SS value when measured over all reported and annotated genetic interactions. With respect to the GO:BP predictions, SS was measured by comparing the set of predicted terms to the set of held-out terms. (c,d) Black reflects predictions made from a network size of 20 K and red reflects predictions made from a network size of 200 K. (c) The proportion of genes at a given threshold tp that show an SS measure of > 0.45. (d) The number of predictions made for both integrated networks. The top plot in (d) shows the total number of genes with at least one prediction in relation to tp, and the bottom bar graph shows the average number of GO:BP terms predicted per gene at a given tp.
Costello et al. Genome Biology 2009 10:R97 doi:10.1186/gb-2009-10-9-r97
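The quantities plotted in panels (c) and (d) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `summarize_at_threshold`, the input layout, the use of `>=` when applying tp, and the choice of denominator (genes with at least one retained prediction) are all hypothetical. The SS measure is passed in as a callable; the `jaccard` function is only a stand-in so the sketch runs and is not the G-SESAME measure.

```python
from typing import Callable, Dict, List, Set, Tuple

SS_CUTOFF = 0.45  # median SS value quoted in the caption


def summarize_at_threshold(
    predictions: Dict[str, List[Tuple[str, float]]],
    held_out: Dict[str, Set[str]],
    ss_fn: Callable[[Set[str], Set[str]], float],
    tp: float,
) -> Dict[str, float]:
    """Summarize predictions retained at probability threshold tp.

    predictions maps gene -> [(GO term, prediction probability), ...].
    held_out maps gene -> set of held-out GO terms.
    ss_fn is any set-versus-set semantic similarity (assumed interface).
    """
    n_genes = 0   # genes with at least one retained prediction
    n_terms = 0   # total retained terms across those genes
    n_above = 0   # genes whose SS exceeds the cutoff
    for gene, scored in predictions.items():
        # Assumption: a term is retained when its probability is >= tp.
        kept = {term for term, prob in scored if prob >= tp}
        if not kept:
            continue
        n_genes += 1
        n_terms += len(kept)
        if ss_fn(kept, held_out.get(gene, set())) > SS_CUTOFF:
            n_above += 1
    return {
        "genes_with_prediction": n_genes,
        "avg_terms_per_gene": n_terms / n_genes if n_genes else 0.0,
        "proportion_above_cutoff": n_above / n_genes if n_genes else 0.0,
    }


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Placeholder similarity for demonstration only; NOT G-SESAME."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

Calling `summarize_at_threshold` over a range of tp values, once per network, would yield the series for the curves in (c) and the two plots in (d); raising tp retains fewer terms per gene and fewer genes overall.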