Experimental annotation of the human pathogen Candida albicans coding and noncoding transcribed regions using high-resolution tiling arrays
1 Biotechnology Research Institute, National Research Council of Canada, 6100 Royalmount, Montréal, Québec, H4P 2R2, Canada
2 Department of Anatomy and Cell Biology, McGill University, 3640 University Street, Montréal, Québec, H3A 1B1, Canada
3 Department of Biology, McGill University, 1205 Docteur Penfield, Montréal, Québec, H3A 1B1, Canada
4 Intracellular Signaling Laboratory, Institute of Research in Immunology and Cancer (IRIC), University of Montreal, 2900 boulevard Édouard-Montpetit, Montreal, Quebec, H3C 3J7, Canada
5 Department of Molecular Biology and Microbiology, Tufts University, 136 Harrison Avenue, Boston, MA 02111, USA
Genome Biology 2010, 11:R71 doi:10.1186/gb-2010-11-7-r71
Published: 9 July 2010
Despite the clinical relevance of the pathogenic yeast Candida albicans, and in contrast to other model organisms, no comprehensive analysis has yet provided experimental support for its in silico-based genome annotation.
We have undertaken a genome-wide experimental annotation to accurately uncover the transcriptional landscape of the pathogenic yeast C. albicans using strand-specific high-density tiling arrays. RNAs were purified from cells growing under conditions relevant to C. albicans pathogenicity, including biofilm, lab-grown yeast and serum-induced hyphae, as well as cells isolated from the mouse caecum. This work provides genome-wide experimental validation for a large number of predicted ORFs for which transcription had not been detected by other approaches. Additionally, we identified more than 2,000 novel transcriptional segments, including new ORFs and exons, non-coding RNAs (ncRNAs) and convincing cases of antisense gene transcription. We also characterized the 5' and 3' UTRs of expressed ORFs, and established that genes with long 5' UTRs are significantly enriched in regulatory functions controlling filamentous growth. Furthermore, we found that genomic regions adjacent to telomeres harbor a cluster of expressed ncRNAs. To validate new ncRNA candidates, we adapted an iterative strategy that combines expression data with the genome-wide occupancy of the different subunits of RNA polymerases I, II and III. This comprehensive approach allowed the identification of different families of ncRNAs.
In summary, we provide a comprehensive expression atlas that covers the developmental stages relevant to C. albicans pathogenicity, and we report the discovery of new ORFs and non-coding genetic elements.