Reinvestigation of the Saccharomyces cerevisiae genome annotation by comparison to the genome of a related fungus: Ashbya gossypii
1 Institute of Applied Microbiology, Biozentrum der Universität Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland
2 Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710-3568, USA
3 Syngenta, Research Triangle Park, NC 27709, USA
Genome Biology 2003, 4:R45 doi:10.1186/gb-2003-4-7-r45Published: 25 June 2003
The recently sequenced genome of the filamentous fungus Ashbya gossypii revealed remarkable similarities to that of the budding yeast Saccharomyces cerevisiae both at the level of homology and synteny (conservation of gene order). Thus, it became possible to reinvestigate the S. cerevisiae genome in the syntenic regions leading to an improved annotation.
We have identified 23 novel S. cerevisiae open reading frames (ORFs) as syntenic homologs of A. gossypii genes; for all but one, homologs are present in other eukaryotes including humans. Other comparisons identified 13 overlooked introns and suggested 69 potential sequence corrections resulting in ORF extensions or ORF fusions with improved homology to the syntenic A. gossypii homologs. Of the proposed corrections, 25 were tested and confirmed by resequencing. In addition, homologs of nearly 1,000 S. cerevisiae ORFs, presently annotated as hypothetical, were found in A. gossypii at syntenic positions and can therefore be considered as authentic genes. Finally, we suggest that over 400 S. cerevisiae ORFs that overlap other ORFs in S. cerevisiae and for which no homolog can be detected in A. gossypii should be regarded as spurious.
Although, the S. cerevisiae genome is rightly considered as one of the most accurately sequenced and annotated eukaryotic genomes, we have shown that it still benefits substantially from comparison to the completed sequence and syntenic gene map of A. gossypii, an evolutionarily related fungus. This type of approach will strongly support the annotation of more complex genomes such as the human and murine genomes.