Deposited research article
This article has not been peer reviewed.
All motifs are NOT created equal: structural properties of transcription factor-DNA interactions and the inference of sequence specificity
Center for Integrative Genomics, Division of Genetics and Development, Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, USA. Department of Genome Sciences, Genomics Division, Ernest Orlando Lawrence Berkeley National Lab, Berkeley, USA
Genome Biology 2005, 6:P7 doi:10.1186/gb-2005-6-5-p7
This was the first version of this article to be made available publicly.
Published: 31 March 2005
The identification of transcription factor binding sites in genome sequences is an important problem in contemporary sequence analysis, and a plethora of approaches to the problem have been proposed, implemented and evaluated in recent years. Although the biological and statistical models, the descriptions of binding sites and the computational algorithms vary considerably among these methods, most share a common assumption: that all motifs are equally likely to be transcription factor binding sites. Here we argue that this simplifying assumption is incorrect. The specific nature of transcription factor-DNA interactions imposes constraints both on the types of motifs that are likely to be transcription factor binding sites and on the relationships between the motifs recognized by structurally similar transcription factors. We propose that our structural and biochemical understanding of the interactions between transcription factors and DNA can be used to guide de novo motif detection methods and, in a series of related papers, introduce several methods that incorporate this idea.