Transcriptional slippage in bacteria: distribution in sequenced genomes and utilization in IS element gene expression
1 Department of Human Genetics, University of Utah, Salt Lake City, UT 84112-5330, USA
2 Bioscience Institute, University College Cork, Cork, Ireland
3 Current address: Gene Technology Division, Nitto Denko Technical Corporation, 401 Jones Road, Oceanside, CA 92054, USA
Genome Biology 2005, 6:R25 doi:10.1186/gb-2005-6-3-r25
Published: 15 February 2005
Transcriptional slippage occurs on certain patterns of repeated mononucleotides, resulting in synthesis of a heterogeneous population of mRNAs. Individual mRNA molecules within this population differ in the number of nucleotides they contain that are not specified by the template. When transcriptional slippage occurs in a coding sequence, translation of the resulting mRNAs yields more than one protein product. Except where the products of the resulting mRNAs have distinct functions, transcriptional slippage occurring in a coding region is expected to be disadvantageous. This probably leads to selection against most slippage-prone sequences in coding regions.
To find a length at which such selection is evident, we analyzed the distribution of repetitive runs of A and T of different lengths in 108 bacterial genomes. This length varies significantly among different bacteria, but in a large proportion of available genomes corresponds to nine nucleotides. Comparative sequence analysis of these genomes was used to identify occurrences of 9A and 9T transcriptional slippage-prone sequences used for gene expression.
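The kind of run-length tally described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `count_runs`, the `min_len` parameter, and the toy sequence are assumptions made for the example, and the sketch scans a single strand only, without distinguishing coding from non-coding regions.

```python
import re
from collections import Counter


def count_runs(sequence, bases="AT", min_len=2):
    """Count maximal homopolymer runs, keyed by (base, run length).

    Hypothetical helper for illustration; it treats the input as one
    strand of DNA and ignores gene annotation entirely.
    """
    counts = Counter()
    seq = sequence.upper()
    for base in bases:
        # The greedy quantifier makes every non-overlapping match a maximal run.
        pattern = f"{base}{{{min_len},}}"
        for match in re.finditer(pattern, seq):
            counts[(base, len(match.group()))] += 1
    return counts


# Toy sequence (an assumption, not real genome data): one 9A run,
# one 9T run, one 2A run, and a lone T that falls below min_len.
toy = "GG" + "A" * 9 + "C" + "T" * 9 + "G" + "AA" + "T"
runs = count_runs(toy)
```

Applied to whole genomes, comparing such counts per run length against an expectation from base composition would be one way to look for the length at which runs become under-represented; that comparison is not shown here.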
IS element genes are the largest group found to exploit this phenomenon. A number of genes with disrupted open reading frames (ORFs) have slippage-prone sequences at which transcriptional slippage would restore an uninterrupted ORF at the mRNA level. The ability of such genes to encode functional full-length protein products calls into question their annotation as pseudogenes and, in these cases, bears on the significance of the term 'authentic frameshift' frequently assigned to such genes.
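To make the idea of ORF restoration concrete, the sketch below uses an invented toy sequence in which a 9A run is followed by an in-frame stop codon; adding one untemplated A to the run shifts the reading frame past that stop. The helper names `codons_before_stop` and `slip`, the toy sequence, and the use of the DNA alphabet (T in place of U) for the transcript are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_before_stop(seq):
    """Number of codons read in frame 0 before the first stop codon (or the end)."""
    n = 0
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        n += 1
    return n


def slip(seq, run_start, run_len, extra):
    """Return seq with `extra` additional copies of the run base added to the run."""
    base = seq[run_start]
    if seq[run_start:run_start + run_len] != base * run_len:
        raise ValueError("no homopolymer run of the stated length at run_start")
    return seq[:run_start] + base * extra + seq[run_start:]


# Invented toy gene: start codon, one codon, a 9A run, then a stop in frame 0.
template = "ATGGCC" + "A" * 9 + "TGACCGGCTAA"
slipped = slip(template, run_start=6, run_len=9, extra=1)

faithful_len = codons_before_stop(template)  # reads ATG GCC AAA AAA AAA, then TGA
restored_len = codons_before_stop(slipped)   # reads on through ATG ACC GGC, then TAA
```

In this toy case the faithful transcript is read for 5 codons before the stop, while the transcript with one extra A is read for 8 codons, ending at a downstream stop in the shifted frame.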