Cross-species analysis of the predictive model for genetic interactions. (a) Pearson correlations between one-to-one S. cerevisiae and S. pombe orthologs for their values of gene features. Note that a number of features are sequence-based and are thus not independent of the sequence-based ortholog identification; features that appear to have trivial correlations are not included here. Error bars show 95% confidence intervals. (b) Pearson correlations between features and degree in S. pombe are significant in many cases and similar to those in S. cerevisiae. A complete set of features and their correlations is given in Table 1; see Materials and methods for descriptions of gene features. Error bars show 95% confidence intervals. (c) Predictive abilities of bagged regression tree models were evaluated by measuring Pearson correlations between predicted and actual degrees. The left set of bars shows the performance of predictions made for approximately 550 S. pombe genes and the right set of bars shows the performance of predictions made for all non-essential deletion mutants in S. cerevisiae. For each scenario, models were trained both on data from the same species (red bar) and on data from the other species (blue bars). The light blue bars correspond to predicting degrees of all genes in the test species, while the dark blue bars correspond to predicting degrees of only those genes lacking orthologs in the training species. Error bars show standard deviations of bootstrapped predictions. As a baseline, the dashed line shows the correlation between observed degrees of one-to-one orthologous genes (a simple prediction method that can be applied only to orthologs). CAI, Codon Adaptation Index; PPI, protein-protein interaction; SM, single mutant.
Koch et al. Genome Biology 2012 13:R57 doi:10.1186/gb-2012-13-7-r57
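
Panels (a) and (b) report Pearson correlations with 95% confidence intervals. The caption does not state how those intervals were computed, so the following is only a minimal sketch of one common approach, the Fisher z-transformation, and should not be read as the authors' procedure. The function names `pearson_r` and `fisher_ci` and the example values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need two sequences of equal length >= 4")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959963984540054):
    """Approximate confidence interval for r via the Fisher z-transformation.

    Assumption: this is a standard textbook interval, not necessarily the
    method used for the figure. The default z_crit is the two-sided 95%
    normal quantile.
    """
    # Clamp r so that atanh stays finite for (near-)perfect correlations.
    r = max(min(r, 1 - 1e-12), -1 + 1e-12)
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical feature values for five ortholog pairs (illustration only).
feature_species_a = [1, 2, 3, 4, 5]
feature_species_b = [2, 4, 5, 4, 5]
r = pearson_r(feature_species_a, feature_species_b)
lo, hi = fisher_ci(r, len(feature_species_a))
```

With so few pairs the interval is very wide; the width shrinks roughly as 1/sqrt(n - 3) as more ortholog pairs are included.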