Coherence of types of data and datasets on individual KEGG pathways. Examples of how types of data and individual datasets compare with the fully integrated network, as measured by the coherence of KEGG pathways. The average coherence of a given dataset is calculated for the set of genes defined by a KEGG pathway at increasing network sizes, up to one million edges. (a) The average coherence over 63 tested KEGG pathways. The full integration of genetic interactions, protein interactions, and microarray data performs best compared with all other data sources and individual datasets. (b) A specific example, the 'purine metabolism' KEGG pathway, in which the fully integrated network performs better than all individual datasets. (c) Ribosomal constituents are highly coherent in the microarray data, with many individual microarray datasets performing well. In this instance, a network that leaves out the genetic interactions and protein interactions performs better than the fully integrated network. (d) An example, the 'Hedgehog signaling' KEGG pathway, in which the genetic interactions and protein interactions contribute nearly all of the coherent relationships. (e) An example, the 'phenylpropanoid biosynthesis' KEGG pathway, in which the integration method performs worse than several individual microarray datasets. See Table 1 for citations for the datasets.
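The caption does not define coherence, so the following is only an illustrative sketch of evaluating a pathway at increasing network sizes. It assumes, hypothetically, that a network is a list of scored gene pairs and that coherence is the fraction of pathway-touching edges among the top-ranked edges that connect two pathway genes; the function name `coherence_at_sizes` and this definition are assumptions, not the authors' method.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical edge representation: (gene_a, gene_b, confidence score).
Edge = Tuple[str, str, float]


def coherence_at_sizes(
    edges: Iterable[Edge], pathway: Set[str], sizes: Iterable[int]
) -> List[float]:
    """Assumed coherence measure, evaluated at several network sizes.

    For each size k, keep the k highest-scoring edges, then return the
    fraction of edges touching the pathway that lie entirely within it.
    """
    # Rank edges from most to least confident.
    ranked = sorted(edges, key=lambda e: e[2], reverse=True)
    results = []
    for k in sizes:
        top = ranked[:k]
        # Edges with at least one endpoint in the pathway.
        touching = [e for e in top if e[0] in pathway or e[1] in pathway]
        # Edges with both endpoints in the pathway.
        within = [e for e in touching if e[0] in pathway and e[1] in pathway]
        results.append(len(within) / len(touching) if touching else 0.0)
    return results
```

Under this assumed definition, averaging the returned values over a collection of pathways would give a per-size curve comparable in spirit to panel (a), and the largest size would be capped at one million edges.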
Costello et al. Genome Biology 2009 10:R97 doi:10.1186/gb-2009-10-9-r97