Genome-wide discovery and characterization of maize long non-coding RNAs
1 Department of Agronomy and Plant Genetics, University of Minnesota, Saint Paul, MN 55108, USA
2 Department of Plant Biology, University of Minnesota, Saint Paul, MN 55108, USA
3 Department of Plant Biology, Cornell University, Ithaca, NY 14853, USA
4 Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
5 Department of Agronomy, Iowa State University, Ames, IA 50011, USA
6 Center for Plant Genomics, Iowa State University, Ames, IA 50011-3650, USA
7 Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
8 Informatics Research Core Facility, University of Missouri, Columbia, MO 65211, USA
9 Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
10 Current address: Pioneer Hi-Bred, Johnston, IA 50131, USA
Genome Biology 2014, 15:R40. doi:10.1186/gb-2014-15-2-r40. Published: 27 February 2014
Long non-coding RNAs (lncRNAs) are transcripts that are 200 bp or longer, do not encode proteins, and potentially play important roles in eukaryotic gene regulation. However, the number, characteristics and expression inheritance pattern of lncRNAs in maize are still largely unknown.
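The working definition above (at least 200 bp long, no protein-coding capacity) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the open reading frame (ORF) cutoff `MAX_ORF_CODONS` and both function names are hypothetical, and only the 200 bp minimum comes from the text. The sketch scans the forward strand only, whereas real analyses usually rely on dedicated coding-potential tools.

```python
# Minimal sketch of a length + ORF filter for candidate lncRNAs.
# MIN_LENGTH follows the definition in the text; MAX_ORF_CODONS is an
# illustrative assumption, not a threshold taken from the paper.
MIN_LENGTH = 200
MAX_ORF_CODONS = 100  # assumed cutoff, for illustration only

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq: str) -> int:
    """Return the longest ATG-to-stop ORF on the forward strand, in codons.

    The start codon is counted; the stop codon is not.
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None
    return best


def is_candidate_lncrna(seq: str) -> bool:
    """Keep transcripts that are long enough and lack a long ORF."""
    return len(seq) >= MIN_LENGTH and longest_orf_codons(seq) <= MAX_ORF_CODONS
```

Under these assumed thresholds, a 250 bp transcript with no ORF passes the filter, while a transcript carrying a 151-codon ORF is rejected as likely protein-coding.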
By exploiting available public EST databases, maize whole genome sequence annotation and RNA-seq datasets from 30 different experiments, we identified 20,163 putative lncRNAs. Of these lncRNAs, more than 90% are predicted to be precursors of small RNAs, while 1,704 are considered to be high-confidence lncRNAs. High-confidence lncRNAs have an average transcript length of 463 bp, and the genes encoding them contain fewer exons than annotated genes. By analyzing the expression pattern of these lncRNAs in 13 distinct tissues and 105 maize recombinant inbred lines, we show that more than 50% of the high-confidence lncRNAs are expressed in a tissue-specific manner, a result that is supported by epigenetic marks. Intriguingly, the inheritance of lncRNA expression patterns in 105 recombinant inbred lines reveals apparent transgressive segregation, and maize lncRNAs are less affected by cis- than by trans-genetic factors.
We integrate all available transcriptomic datasets to identify a comprehensive set of maize lncRNAs, provide a unique annotation resource of the maize genome and a genome-wide characterization of maize lncRNAs, and explore the genetic control of their expression using expression quantitative trait locus mapping.