This article is part of the supplement: Quantitative inference of gene function from diverse large-scale datasets

Open Access, Method

Combining guilt-by-association and guilt-by-profiling to predict Saccharomyces cerevisiae gene function

Weidong Tian1, Lan V Zhang1,4, Murat Taşan1, Francis D Gibbons1,5, Oliver D King1,6, Julie Park2, Zeba Wunderlich1,7, J Michael Cherry2 and Frederick P Roth1,3*

Author Affiliations

1 Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Longwood Avenue, Boston, Massachusetts 02115, USA

2 Department of Genetics, School of Medicine, Stanford University, Stanford, California 94305-5120, USA

3 Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Jimmy Fund Way, Boston, Massachusetts 02115, USA

4 McKinsey and Company, Hansen Way, Palo Alto, California 94304, USA

5 Merrimack Pharmaceuticals, Kendall Square, Cambridge, Massachusetts 02139, USA

6 Boston Biomedical Research Institute (BBRI), Grove St., Watertown, Massachusetts 02472, USA

7 Massachusetts Institute of Technology, Massachusetts Ave, Cambridge, Massachusetts 02139, USA


Genome Biology 2008, 9(Suppl 1):S7  doi:10.1186/gb-2008-9-s1-s7

Published: 27 June 2008



Learning the function of genes is a major goal of computational genomics. Methods for inferring gene function have typically fallen into two categories: 'guilt-by-profiling', which exploits correlation between function and other gene characteristics; and 'guilt-by-association', which transfers function from one gene to another via biological relationships.


We have developed a strategy ('Funckenstein') that performs guilt-by-profiling and guilt-by-association and combines the results. Using a benchmark set of functional categories and input data for protein-coding genes in Saccharomyces cerevisiae, Funckenstein was compared with a previous combined strategy. Subsequently, we applied Funckenstein to 2,455 Gene Ontology terms. In the process, we developed 2,455 guilt-by-profiling classifiers based on 8,848 gene characteristics and 12 functional linkage graphs based on 23 biological relationships.
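As a rough illustration of what combining the two kinds of evidence can look like, the sketch below scores one gene for one functional category in two ways and then merges the scores. It is not the Funckenstein implementation: the neighbour-fraction score, the linear profile score, the noisy-OR combination rule, and all gene, feature and function names are assumptions made up for this example.

```python
from typing import Dict, Set


def gba_score(gene: str, graph: Dict[str, Set[str]], annotated: Set[str]) -> float:
    """Guilt-by-association (toy version): fraction of the gene's
    neighbours in a functional linkage graph that already carry the function."""
    neighbours = graph.get(gene, set())
    if not neighbours:
        return 0.0
    return len(neighbours & annotated) / len(neighbours)


def gbp_score(gene: str, features: Dict[str, Set[str]], weights: Dict[str, float]) -> float:
    """Guilt-by-profiling (toy version): sum of weights of the gene's
    binary characteristics, clipped to the range [0, 1]."""
    raw = sum(weights.get(f, 0.0) for f in features.get(gene, set()))
    return max(0.0, min(1.0, raw))


def combined_score(p_gba: float, p_gbp: float) -> float:
    """Noisy-OR combination (an assumption for this sketch): the gene is
    predicted to have the function if either line of evidence supports it."""
    return 1.0 - (1.0 - p_gba) * (1.0 - p_gbp)


# Hypothetical inputs: GENE_X has four linked genes, two of them annotated.
graph = {"GENE_X": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}}
annotated = {"GENE_A", "GENE_B"}

# Hypothetical gene characteristics and hand-picked weights.
features = {"GENE_X": {"has_domain_x", "nuclear_localization"}}
weights = {"has_domain_x": 0.25, "nuclear_localization": 0.25}

p_assoc = gba_score("GENE_X", graph, annotated)
p_profile = gbp_score("GENE_X", features, weights)
p_final = combined_score(p_assoc, p_profile)
print(p_assoc, p_profile, p_final)
```

In this toy setting each component gives the gene a score of 0.5 and the combined score is higher than either, which mirrors the idea that two independent lines of evidence can reinforce each other. A real system would learn both the component classifiers and the combination rule from data.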


Funckenstein outperforms a previous combined strategy on a common benchmark dataset. Combining 'guilt-by-profiling' and 'guilt-by-association' significantly improved on the component classifiers, with the greatest synergy for the most specific functions. Performance was evaluated by cross-validation and by literature examination of the top-scoring novel predictions. These quantitative predictions should help prioritize experimental study of yeast gene functions.