Transcriptional and structural impact of TATA-initiation site spacing in mammalian core promoters
1 Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, 230-0045, Japan
2 MRC Functional Genetics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3QX, UK
3 Computational Biology Unit, Bergen Center for Computational Science, University of Bergen, HIB, Thormøhlensgate 55, N-5008 Bergen, Norway
4 Genome Science Laboratory, Discovery and Research Institute, RIKEN Wako Institute, Wako, Saitama, 351-0198, Japan
Genome Biology 2006, 7:R78 doi:10.1186/gb-2006-7-8-r78Published: 17 August 2006
The TATA box, one of the most well studied core promoter elements, is associated with induced, context-specific expression. The lack of precise transcription start site (TSS) locations linked with expression information has impeded genome-wide characterization of the interaction between TATA and the pre-initiation complex.
Using a comprehensive set of 5.66 × 106 sequenced 5' cDNA ends from diverse tissues mapped to the mouse genome, we found that the TATA-TSS distance is correlated with the tissue specificity of the downstream transcript. To achieve tissue-specific regulation, the TATA box position relative to the TSS is constrained to a narrow window (-32 to -29), where positions -31 and -30 are the optimal positions for achieving high tissue specificity. Slightly larger spacings can be accommodated only when there is no optimally spaced initiation signal; in contrast, the TATA box like motifs found downstream of position -28 are generally nonfunctional. The strength of the TATA binding protein-DNA interaction plays a subordinate role to spacing in terms of tissue specificity. Furthermore, promoters with different TATA-TSS spacings have distinct features in terms of consensus sequence around the initiation site and distribution of alternative TSSs. Unexpectedly, promoters that have two dominant, consecutive TSSs are TATA depleted and have a novel GGG initiation site consensus.
In this report we present the most comprehensive characterization of TATA-TSS spacing and functionality to date. The coupling of spacing to tissue specificity at the transcriptome level provides important clues as to the function of core promoters and the choice of TSS by the pre-initiation complex.