Combined analysis reveals a core set of cycling genes
1 Department of Computer Science, Carnegie Mellon University, Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA
2 Department of Computational Biology, University of Pittsburgh Medical School, Lothrop Street, Pittsburgh, Pennsylvania 15213, USA
3 Machine Learning Department, Carnegie Mellon University, Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA
4 Department of Molecular Biology, Hebrew University Medical School, Jerusalem, Israel 91120
5 Basic Sciences Division, Fred Hutchinson Cancer Center, Fairview Avenue N, Seattle, Washington 98109, USA
Genome Biology 2007, 8:R146 doi:10.1186/gb-2007-8-7-r146Published: 24 July 2007
Global transcript levels throughout the cell cycle have been characterized using microarrays in several species. Early analysis of these experiments focused on individual species. More recently, a number of studies have concluded that a surprisingly small number of genes conserved in two or more species are periodically transcribed in these species. Combining and comparing data from multiple species is challenging because of noise in expression data, the different synchronization and scoring methods used, and the need to determine an accurate set of homologs.
To solve these problems, we developed and applied a new algorithm to analyze expression data from multiple species simultaneously. Unlike previous studies, we find that more than 20% of cycling genes in budding yeast have cycling homologs in fission yeast and 5% to 7% of cycling genes in each of four species have cycling homologs in all other species. These conserved cycling genes display much stronger cell cycle characteristics in several complementary high throughput datasets.
Essentiality analysis for yeast and human genes confirms these findings. Motif analysis indicates conservation in the corresponding regulatory mechanisms. Gene Ontology analysis and analysis of the genes in the conserved sets sheds light on the evolution of specific subfunctions within the cell cycle.
Our results indicate that the conservation in cyclic expression patterns is much greater than was previously thought. These genes are highly enriched for most cell cycle categories, and a large percentage of them are essential, supporting our claim that cross-species analysis can identify the core set of cycling genes.