Genome-wide analysis of mRNA lengths in Saccharomyces cerevisiae
1 Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305-5307, USA
2 Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA 94305-5428, USA
Genome Biology 2003, 5:R2. Published: 22 December 2003
Although the protein-coding sequences in the Saccharomyces cerevisiae genome have been studied and annotated extensively, much less is known about the extent and characteristics of the untranslated regions of yeast mRNAs.
We developed a 'Virtual Northern' method that uses DNA microarrays for systematic, genome-wide analysis of mRNA lengths. We used this method to measure the lengths of mRNAs corresponding to 84% of the annotated open reading frames (ORFs) in the S. cerevisiae genome, with high precision and accuracy (measurement errors of ±6-7%). We found a close linear relationship between mRNA lengths and the lengths of known or predicted translated sequences; mRNAs were typically around 300 nucleotides longer than the translated sequences. Analysis of genes deviating from that relationship identified ORFs with annotation errors, ORFs that appear not to be bona fide genes, and potentially novel genes. We also found that systematic differences in the total length of the untranslated sequences in mRNAs were related to the functions of the encoded proteins.
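The idea of screening for genes that deviate from the length relationship can be illustrated with a minimal Python sketch. It is a simplified stand-in, not the analysis used in the study: instead of a regression fit it takes the median difference between mRNA length and ORF length as the typical untranslated length, then flags genes whose measured mRNA length departs from the expected value by more than a chosen fraction. The gene names, lengths, function names, and the 25% tolerance are all hypothetical.

```python
from statistics import median


def typical_untranslated_length(genes):
    """Median of (mRNA length - ORF length) across genes, in nucleotides."""
    return median(mrna_len - orf_len for _, orf_len, mrna_len in genes)


def flag_deviating(genes, offset, tolerance=0.25):
    """Return names of genes whose measured mRNA length differs from
    (ORF length + offset) by more than `tolerance`, as a fraction of
    the expected length. The tolerance is an arbitrary illustrative cutoff."""
    flagged = []
    for name, orf_len, mrna_len in genes:
        expected = orf_len + offset
        if abs(mrna_len - expected) / expected > tolerance:
            flagged.append(name)
    return flagged


# Hypothetical (name, ORF length, measured mRNA length) records in nucleotides.
# These are made-up values for illustration, not data from the study.
genes = [
    ("geneA", 600, 900),
    ("geneB", 1200, 1480),
    ("geneC", 1800, 2120),
    ("geneD", 2400, 2700),
    ("geneE", 3000, 1400),  # mRNA far shorter than the annotated ORF
]

offset = typical_untranslated_length(genes)
suspects = flag_deviating(genes, offset)
print(offset, suspects)
```

A gene flagged this way, such as the hypothetical `geneE`, would be a candidate for closer inspection, for example as a possible annotation error. Using the median keeps a single aberrant gene from distorting the estimate of the typical offset.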
The Virtual Northern method offers a practical and efficient approach to genome-scale analysis of transcript lengths. Approximately 12-15% of the yeast genome is represented in untranslated sequences of mRNAs. A systematic relationship between the lengths of the untranslated regions in yeast mRNAs and the functions of the proteins they encode may point to an important regulatory role for these sequences.
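A rough order-of-magnitude check on the genome fraction can be sketched as follows. This is not necessarily how the estimate was derived: it assumes round figures of 5,000 to 6,000 transcribed genes and a genome of about 12 million base pairs, and combines them with the typical untranslated length of around 300 nucleotides per mRNA. The function name and all input values other than the 300-nucleotide figure are assumptions.

```python
def untranslated_fraction(n_genes, untranslated_nt_per_mrna, genome_nt):
    """Fraction of the genome covered by untranslated mRNA sequence,
    ignoring any overlap between neighbouring transcripts."""
    return n_genes * untranslated_nt_per_mrna / genome_nt


GENOME_NT = 12_000_000   # assumed round figure for the yeast genome size
UNTRANSLATED_NT = 300    # typical untranslated length per mRNA, from the text

# Assumed lower and upper gene counts, chosen only for illustration.
low = untranslated_fraction(5000, UNTRANSLATED_NT, GENOME_NT)
high = untranslated_fraction(6000, UNTRANSLATED_NT, GENOME_NT)
print(f"{low:.1%} to {high:.1%}")
```

Under these assumptions the result falls in the same range as the stated 12-15%.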