<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2009-10-5-r46</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Software</dochead>
      <bibl>
         <title>
            <p>MotifAdjuster: a tool for computational reassessment of transcription factor binding site annotations</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Keilwagen</snm>
               <fnm>Jens</fnm>
               <insr iid="I1"/>
               <email>Jens.Keilwagen@ipk-gatersleben.de</email>
            </au>
            <au id="A2">
               <snm>Baumbach</snm>
               <fnm>Jan</fnm>
               <insr iid="I2"/>
               <email>jbaumbac@icsi.berkeley.edu</email>
            </au>
            <au id="A3">
               <snm>Kohl</snm>
               <mi>A</mi>
               <fnm>Thomas</fnm>
               <insr iid="I3"/>
               <insr iid="I4"/>
               <email>tkohl@cebitec.uni-bielefeld.de</email>
            </au>
            <au id="A4">
               <snm>Grosse</snm>
               <fnm>Ivo</fnm>
               <insr iid="I1"/>
               <insr iid="I5"/>
               <email>grosse@informatik.uni-halle.de</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben (IPK), Corrensstra&#223;e 3, 06466 Gatersleben, Germany</p>
            </ins>
            <ins id="I2">
               <p>International Computer Science Institute, 1947 Center Street, Berkeley, California 94704, USA</p>
            </ins>
            <ins id="I3">
               <p>International NRW Graduate School in Bioinformatics and Genome Research, Center for Biotechnology (CeBiTec), Bielefeld University, Universit&#228;tsstra&#223;e 27, 33615 Bielefeld, Germany</p>
            </ins>
            <ins id="I4">
               <p>Institute for Genome Research and Systems Biology (IGS), Center for Biotechnology (CeBiTec), Bielefeld University, Universit&#228;tsstra&#223;e 27, 33615 Bielefeld, Germany</p>
            </ins>
            <ins id="I5">
               <p>Institute of Computer Science, Martin Luther University Halle-Wittenberg, Von-Seckendorff-Platz 1, 06120 Halle, Germany</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2009</pubdate>
         <volume>10</volume>
         <issue>5</issue>
         <fpage>R46</fpage>
         <url>http://genomebiology.com/2009/10/5/R46</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19409082</pubid>
               <pubid idtype="doi">10.1186/gb-2009-10-5-r46</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>19</day>
               <month>2</month>
               <year>2009</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>17</day>
               <month>4</month>
               <year>2009</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>1</day>
               <month>5</month>
               <year>2009</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>1</day>
               <month>5</month>
               <year>2009</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2009</year>
         <collab>Keilwagen et al., licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <shorttitle>
         <p>MotifAdjuster</p>
      </shorttitle>
      <shortabs>
         <p>MotifAdjuster helps to detect errors in binding site annotations.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>Valuable binding-site annotation data are stored in databases. However, several types of errors can, and do, occur in the process of manually incorporating annotation data from the scientific literature into these databases. Here, we introduce MotifAdjuster <url>http://dig.ipk-gatersleben.de/MotifAdjuster.html</url>, a tool that helps to detect these errors, and we demonstrate its efficacy on public data sets.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Rationale</p>
         </st>
         <p>The regulation of gene expression involves a complex system of interacting components in all living organisms <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> and is of fundamental interest, for instance, for cell maintenance and development. One level of regulation is realized by DNA-binding transcription factors (TFs). The DNA-binding domain of a TF is capable of recognizing specific binding sites (BSs) in the promoter regions of its target genes <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Binding of a TF can induce (activator) or inhibit (repressor) the transcription of its target genes. The general ability to control a target gene may depend on the BS itself, its strand orientation, and its position with respect to the transcription start site. If other BSs are present, the ability of a TF to bind the DNA may additionally depend on strand orientations and positions of these BSs.</p>
         <p>One important prerequisite for research on gene regulation is the reliable annotation of BSs. The approximate regions on the double-stranded DNA sequence bound by TFs can be determined by wet-lab experiments such as electrophoretic mobility shift assays (EMSAs) <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, DNase footprinting <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, enzyme-linked immunosorbent assay (ELISA) <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>, ChIP-chip <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, or mutations of the putative BS and subsequent expression studies. Because TFs bind to double-stranded DNA, these experiments do not reveal the strand orientation of a BS. The strand annotations of nonpalindromic BSs in the databases are therefore either missing or added on the basis of manual inspection or predictions from bioinformatics tools such as MEME <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, Gibbs Sampler <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>, Improbizer <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, SeSiMCMC <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, or A-GLAM <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
         <p>After wet-lab identification, data about transcriptional gene regulatory interactions, including the annotated BSs, are published in the scientific literature. Subsequently, these data are extracted by curation teams and manually entered into databases on transcriptional gene regulation such as CoryneRegNet <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, PRODORIC <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, or RegulonDB <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> for prokaryotes, and AGRIS <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, AthaMap <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, CTCFBSDB <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, JASPAR <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, OregAnno <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, SCPD <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>, TRANSFAC <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, TRED <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, or TRRD <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> for eukaryotes. Three typical problems may occur during the process of transferring these data.</p>
         <p>First, erroneously annotated BS: A sequence is declared to contain a BS although, in reality, it does not. This error may occur in the original study or during the transfer from the scientific literature to the databases.</p>
         <p>Second, shift of the BS: The BS may be erroneously shifted by one or a few base pairs. This typically happens during the transfer process from the scientific literature to the databases.</p>
         <p>Third, missing or wrong strand orientation of the BS: The strand orientation of a BS is often missing or incorrectly annotated. For example, all BS orientations are arbitrarily declared to be in 5'&#8594;3' direction relative to the target gene in CoryneRegNet and in RegulonDB <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B16">16</abbr></abbrgrp>.</p>
         <p>These problems can strongly affect any of the subsequent analysis steps, such as the inference of sequence motifs from "experimentally verified" data, the calculation of <it>P </it>values for the occurrence of BSs, the detection of putative BSs in genome-wide scans and their experimental validation, or the reconstruction of transcriptional gene-regulatory networks.</p>
         <p>Here, we introduce MotifAdjuster, a software tool for detecting potential BS annotation errors and for proposing possible corrections. Existing bioinformatics tools <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp> are not optimized for this task (Additional data file 1), because they do not allow shifting the BS by using a nonuniform distribution and considering both strands with unequal weights. In contrast, MotifAdjuster allows the user to incorporate prior knowledge about (i) the probability of erroneously annotated BSs, (ii) the distribution of possible shifts, and (iii) the strand preference.</p>
         <p>One widely used model for the representation of BSs is the position weight matrix (PWM) model <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>, and many software tools for genome-wide scans of sequence motifs are based on PWM models <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr></abbrgrp>. MotifAdjuster is based on a simple mixture model using a PWM model on both strands for the motif sequences and a homogeneous Markov model of order 0 for the flanking sequences, similar to MEME, Gibbs Sampler, Improbizer, SeSiMCMC, or A-GLAM. For a given set of BSs, MotifAdjuster tests whether each sequence contains a BS and, if necessary, refines the annotations of position and strand for each BS by maximizing the posterior of the mixture model with a simple <it>expectation maximization </it>(EM) algorithm.</p>
         <p>To test the efficacy of MotifAdjuster, we apply it to seven data sets from CoryneRegNet, and we record for each of them the set of potential annotation errors. For one example, the nitrate regulator NarL, we compare the proposed adjustments with the original literature, with a manual reannotation of the BS strands, and with an independent and hand-curated reannotation provided by PRODORIC. Finally, we test whether the PWM estimated from the adjusted NarL BSs can help to detect unknown BSs in those promoter regions that are known to be bound by NarL, but for which no BS could be predicted in the past.</p>
      </sec>
      <sec>
         <st>
            <p>Algorithm</p>
         </st>
         <p>In this section, we present the MotifAdjuster algorithm including the mixture model, the prior, and the maximum <it>a posteriori </it>(MAP) estimation of the model parameters given the data.</p>
         <sec>
            <st>
               <p>Mixture model</p>
            </st>
            <p>We denote a DNA sequence of length <it>L </it>by <it><ul>x</ul></it>:= (<it>x</it><sub>1</sub>, <it>x</it><sub>2</sub>, ..., <it>x</it><sub><it>L</it></sub>), the nucleotide at position <it>&#8467; </it>&#8712; [1, <it>L</it>] by <it>x</it><sub><it>&#8467; </it></sub>&#8712; {<it>A</it>, <it>C</it>, <it>G</it>, <it>T</it>}, and the <it>reverse complement </it>of <it><ul>x</ul></it> by <it><ul>x</ul></it><sup><it>RC</it></sup>. For modeling a BS <it><ul>x</ul></it> of length <it>w</it>, we use a PWM model, which assumes that the nucleotides at all positions are statistically independent of each other, resulting in an additive log-likelihood</p>
            <p>
               <display-formula id="M1">
                  <graphic file="gb-2009-10-5-r46-i1.gif"/>
               </display-formula>
            </p>
            <p>of sequence <it><ul>x</ul></it> given the model parameters <it><ul>&#955;</ul></it>&#160;<abbrgrp><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr></abbrgrp>, where the subscript <it>f </it>stands for <it>foreground</it>. Here, <inline-formula><graphic file="gb-2009-10-5-r46-i2.gif"/></inline-formula> denotes the logarithm of the probability of finding nucleotide <it>a </it>&#8712; {<it>A</it>, <it>C</it>, <it>G</it>, <it>T</it>} at position <it>&#8467;</it>, <it><ul>&#955;</ul></it><sup><it>&#8467; </it></sup>denotes the four-dimensional vector <inline-formula><graphic file="gb-2009-10-5-r46-i3.gif"/></inline-formula>, and <it><ul>&#955;</ul></it> denotes the (4 &#215; <it>w</it>) matrix, that is, <it><ul>&#955;</ul></it> denotes the PWM <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp>.</p>
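            <p>The additive log-likelihood of the PWM model can be illustrated with a minimal Python sketch. It assumes that the PWM is stored as one dictionary of log-probabilities per position. The function name and the example PWM of width 3 are hypothetical and are not part of MotifAdjuster.</p>

```python
import math


def pwm_log_likelihood(site, log_pwm):
    """Sum the per-position log-probabilities of a site under a PWM.

    log_pwm[l][a] holds the log-probability of nucleotide a at position l
    (0-based here, whereas the text counts positions from 1).
    """
    if len(site) != len(log_pwm):
        raise ValueError("site length must equal the PWM width")
    return sum(log_pwm[l][a] for l, a in enumerate(site))


# Hypothetical PWM of width 3, chosen only for illustration.
probabilities = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]
log_pwm = [{a: math.log(p) for a, p in column.items()}
           for column in probabilities]

score = pwm_log_likelihood("AGT", log_pwm)
print(score)
```

            <p>Because the positions are assumed to be independent, the score of the site AGT is simply the sum of three logarithms, one per position.</p>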
            <p>For modeling the flanking sequences, we use a homogeneous Markov model of order 0, which assumes that all nucleotides are statistically independent, resulting in an additive log-likelihood</p>
            <p>
               <display-formula id="M2">
                  <graphic file="gb-2009-10-5-r46-i4.gif"/>
               </display-formula>
            </p>
            <p>of sequence <it><ul>x</ul></it> given model parameters <it><ul>&#964;</ul></it>&#160;<abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp>, where the subscript <it>b </it>stands for <it>background</it>. Here, <it>&#964;</it><sub><it>a </it></sub>denotes the logarithm of the probability of nucleotide <it>a</it>, and <it><ul>&#964;</ul></it> denotes the vector (<it>&#964;</it><sub><it>A</it></sub>, ..., <it>&#964;</it><sub><it>T</it></sub>)<sup><it>T</it></sup>.</p>
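            <p>For comparison, the log-likelihood of the homogeneous Markov model of order 0 uses the same four log-probabilities at every position. The following Python sketch illustrates this under the assumption that the parameters are stored in a dictionary. The function name and the example parameters are hypothetical.</p>

```python
import math


def background_log_likelihood(seq, log_tau):
    """Log-likelihood under a homogeneous Markov model of order 0.

    log_tau[a] is the log-probability of nucleotide a; it is the same
    at every position, so only the nucleotide counts matter.
    """
    return sum(log_tau[a] for a in seq)


# Hypothetical background distribution, chosen only for illustration.
log_tau = {a: math.log(p)
           for a, p in {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}.items()}

print(background_log_likelihood("AACG", log_tau))
print(background_log_likelihood("GCAA", log_tau))  # same nucleotide counts
```

            <p>In contrast to the PWM model, permuting the nucleotides of a sequence does not change its log-likelihood under this background model.</p>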
            <p>For the detection of sequences (i) erroneously annotated as containing BSs, (ii) with shifted BSs, or (iii) with missing or wrong strand annotations, we introduce the three random variables <it>u</it><sub>1</sub>, <it>u</it><sub>2</sub>, and <it>u</it><sub>3</sub>.</p>
            <p>The variable <it>u</it><sub>1 </sub>handles the possibility that a sequence annotated as containing a BS does not contain a BS. <it>u</it><sub>1 </sub>= 0 denotes the case that the sequence contains no BS, and <it>u</it><sub>1 </sub>= 1 denotes the case that the sequence contains exactly one BS. If the sequence contains one BS, it can be located at different positions and on both strands.</p>
            <p>The variable <it>u</it><sub>2 </sub>handles the possibility of shifts of a BS caused by annotation errors. <it>u</it><sub>2 </sub>models the start position of the BS in the sequence with respect to the annotated start position. This variable can assume the integer values {-<it>s</it>, -(<it>s</it>-1), ..., <it>s</it>-1, <it>s</it>}, where <it>s </it>is the maximal shift of the BS upstream or downstream of the annotated position.</p>
            <p>The variable <it>u</it><sub>3 </sub>handles the possibility that a BS can have two orientations in the double-stranded upstream region of the target gene. According to the notation of CoryneRegNet, <it>u</it><sub>3 </sub>= 0 denotes the forward strand defined as the strand in 5'&#8594;3' direction relative to the target gene, and <it>u</it><sub>3 </sub>= 1 denotes the reverse complementary strand.</p>
            <p>For shortness of notation, we define <it><ul>u</ul></it> := (<it>u</it><sub>1</sub>, <it>u</it><sub>2</sub>, <it>u</it><sub>3</sub>). Because we do not know the values of <it><ul>u</ul></it>, these variables are modeled as hidden variables. We assume that <it>u</it><sub>2 </sub>and <it>u</it><sub>3 </sub>are conditionally independent of each other given <it>u</it><sub>1</sub>; that is, we assume that annotation errors of position and strand are conditionally independent given the occurrence of the BS. We define</p>
            <p>
               <display-formula id="M3">
                  <graphic file="gb-2009-10-5-r46-i5.gif"/>
               </display-formula>
            </p>
            <p>where the subscript <it>h </it>stands for <it>hidden</it>, and where <it><ul>&#632; </ul></it>:= (<it><ul>&#632;</ul></it><sub>1</sub>, <it><ul>&#632;</ul></it><sub>2</sub>, <it><ul>&#632;</ul></it><sub>3</sub>) denotes the vector of parameters of this distribution. MotifAdjuster allows the user to specify the probability <it>P</it><sub><it>h </it></sub>(<it>u</it><sub>1</sub>|<it><ul>&#632;</ul></it><sub>1</sub>) that a sequence contains (or does not contain) a BS and the probability distribution <it>P</it><sub><it>h </it></sub>(<it>u</it><sub>2</sub>|<it>u</it><sub>1</sub>, <it><ul>&#632;</ul></it><sub>2</sub>) for the length of the erroneous shift. In addition, MotifAdjuster estimates the logarithm of the probability that the BS is located on the forward (<it>v </it>= 0) or the reverse complementary (<it>v </it>= 1) strand, <inline-formula><graphic file="gb-2009-10-5-r46-i21.gif"/></inline-formula>, from the user-provided data as described in subsection <it>Expectation maximization algorithm</it>.</p>
            <p>The hidden values of <it><ul>u</ul></it> lead to the likelihood</p>
            <p>
               <display-formula id="M4">
                  <graphic file="gb-2009-10-5-r46-i6.gif"/>
               </display-formula>
            </p>
            <p>of the data <it><ul>x</ul></it> given the model parameters (<it><ul>&#955;</ul></it>, <it><ul>&#964;</ul></it>, <it><ul>&#632;</ul></it>), where the sum runs over all possible values of <it><ul>u</ul></it>. Here, the subscript <it>a </it>stands for <it>accumulated</it>, and the subscript <it>c </it>stands for <it>composite</it>. In the following, we define the likelihood in close analogy to <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B37">37</abbr></abbrgrp>. If sequence <it><ul>x</ul></it> contains no BS, we assume that <it><ul>x</ul></it> is generated by a homogeneous Markov model of order 0; that is,</p>
            <p>
               <display-formula id="M5">
                  <graphic file="gb-2009-10-5-r46-i7.gif"/>
               </display-formula>
            </p>
            <p>If the sequence <it><ul>x</ul></it> contains a BS, then <it>u</it><sub>2 </sub>encodes its start position, <it>u</it><sub>3 </sub>encodes its strand, and we assume that the nucleotides upstream and downstream of the BS are generated by a homogeneous Markov model of order 0, yielding</p>
            <p>
               <display-formula id="M6">
                  <graphic file="gb-2009-10-5-r46-i8.gif"/>
               </display-formula>
            </p>
            <p>and</p>
            <p>
               <display-formula id="M7">
                  <graphic file="gb-2009-10-5-r46-i9.gif"/>
               </display-formula>
            </p>
            <p>where the subscript <it>m </it>stands for <it>motif</it>.</p>
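            <p>The following Python sketch shows one way in which the sum over the hidden variables could be evaluated for a single sequence. It is an illustration under stated assumptions and not the MotifAdjuster implementation: the annotated start position is passed as a 0-based index, the reverse strand is handled by scoring the reverse complement of the motif window, the flanking nucleotides are scored on the given strand, and all function and parameter names are hypothetical.</p>

```python
import math

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[a] for a in reversed(seq))


def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def background_ll(seq, log_tau):
    return sum(log_tau[a] for a in seq)


def composite_log_likelihood(x, log_pwm, log_tau, p_site, shift_prior,
                             p_forward, annotated_start):
    """Log-likelihood of x, summed over all values of (u1, u2, u3).

    p_site and p_forward must lie strictly between 0 and 1, and
    shift_prior maps each allowed shift to a positive probability.
    """
    w = len(log_pwm)
    # u1 = 0: no BS, the whole sequence comes from the background model.
    terms = [math.log(1.0 - p_site) + background_ll(x, log_tau)]
    # u1 = 1: one BS at every allowed shift (u2) and on either strand (u3).
    for shift, p_shift in shift_prior.items():
        start = annotated_start + shift
        if 0 > start or start + w > len(x):
            raise ValueError("shifted site leaves the sequence")
        window = x[start:start + w]
        flanks = x[:start] + x[start + w:]
        for p_strand, site in ((p_forward, window),
                               (1.0 - p_forward, reverse_complement(window))):
            motif_ll = sum(log_pwm[l][a] for l, a in enumerate(site))
            terms.append(math.log(p_site * p_shift * p_strand)
                         + motif_ll + background_ll(flanks, log_tau))
    return log_sum_exp(terms)


# Sanity check with uniform models: every term then has the same likelihood.
uniform = {a: math.log(0.25) for a in "ACGT"}
log_pwm = [dict(uniform) for _ in range(4)]
shift_prior = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}
value = composite_log_likelihood("ACGTACGT", log_pwm, uniform,
                                 0.8, shift_prior, 0.5, 2)
print(value, 8 * math.log(0.25))
```

            <p>With a uniform PWM and a uniform background, the prior probabilities of the hidden variables sum to one, so the result reduces to the log-likelihood of eight independent uniform nucleotides.</p>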
         </sec>
         <sec>
            <st>
               <p>Prior</p>
            </st>
            <p>As prior of the parameters of the PWM model, we use the "common choice" <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp> of a product of transformed Dirichlets</p>
            <p>
               <display-formula id="M8">
                  <graphic file="gb-2009-10-5-r46-i10.gif"/>
               </display-formula>
            </p>
            <p>where <inline-formula><graphic file="gb-2009-10-5-r46-i11.gif"/></inline-formula> denotes the positive hyperparameter of <inline-formula><graphic file="gb-2009-10-5-r46-i2.gif"/></inline-formula>, <inline-formula><graphic file="gb-2009-10-5-r46-i12.gif"/></inline-formula> denotes the equivalent sample size (ESS) at position <it>&#8467;</it>, which we set to be equal at each position, <it><ul>&#945;</ul></it><sup><it>&#8467; </it></sup>denotes the four-dimensional vector <inline-formula><graphic file="gb-2009-10-5-r46-i13.gif"/></inline-formula>, and <it><ul>&#945;</ul></it> denotes the (4 &#215; <it>w</it>) matrix (<it><ul>&#945;</ul></it><sub>1</sub>, ..., <it><ul>&#945;</ul></it><sub><it>w</it></sub>).</p>
            <p>The choice of this prior is pragmatic rather than biologically motivated. This prior is conjugate to the likelihood, allowing the posterior to be written as a product of transformed Dirichlets. As PWM models are special cases of Bayesian networks, the chosen prior can be understood as a special case of the Bayesian Dirichlet (BD) prior <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>.</p>
            <p>Analogously, for homogeneous Markov models of order 0, we choose a transformed Dirichlet <it>P</it>(<it><ul>&#964;</ul></it>|<it><ul>&#946;</ul></it>) := D(<it><ul>&#964;</ul></it>|<it><ul>&#946;</ul></it>), where <it>&#946;</it><sub><it>a </it></sub>denotes the positive hyperparameter of <it>&#964;</it><sub><it>a</it></sub>.</p>
            <p>MotifAdjuster allows the user to specify <it>P</it>(<it>u</it><sub>1</sub>|<it><ul>&#632;</ul></it><sub>1</sub>) and <it>P</it>(<it>u</it><sub>2</sub>|<it>u</it><sub>1</sub>, <it><ul>&#632;</ul></it><sub>2</sub>). In principle, MotifAdjuster allows the user to specify any probability distribution <it>P</it>(<it>u</it><sub>2</sub>|<it>u</it><sub>1</sub>, <it><ul>&#632;</ul></it><sub>2</sub>) for the length of the erroneous shift, allowing also asymmetric or bimodal distributions, if needed. For an easy and user-friendly execution, MotifAdjuster also offers a discrete and symmetrically truncated Gaussian distribution defined by</p>
            <p>
               <display-formula id="M9">
                  <graphic file="gb-2009-10-5-r46-i14.gif"/>
               </display-formula>
            </p>
            <p>where <it>z </it>is an integer value ranging from <it>-s </it>to <it>s</it>. The real-valued parameter <it>&#963; </it>is similar to the standard deviation of a Gaussian distribution and can be specified by the user, and we denote <it><ul>&#632;</ul></it><sub>2 </sub>:= (<it>s</it>, <it>&#963;</it>).</p>
            <p>We expect that some sequences are annotated to contain a BS, although they do not contain a BS in reality, but we believe that the fraction of such incorrectly annotated sequences is small. Hence, we choose <it>P</it>(<it>u</it><sub>1 </sub>= 0|<it><ul>&#632;</ul></it><sub>1</sub>) = 0.2 for the studies presented in this article; that is, we assume that only 20% of the sequences annotated to contain a BS do not contain a BS in reality. We further expect that the annotated position of the BS might be shifted accidentally by a few base pairs, so we choose <it>s </it>= 5 and a discrete and symmetrically truncated Gaussian distribution with <it>&#963; </it>= 1. Given that a BS is present in sequence <it><ul>x</ul></it>, this choice results in a conditional probability of approximately 40% that the BS is not shifted, of approximately 25% each that it is shifted by exactly 1 bp upstream or downstream, and of approximately 5% each that it is shifted by more than 1 bp upstream or downstream of the annotated start position.</p>
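            <p>These approximate percentages can be checked with a short Python sketch. It assumes that the discrete and symmetrically truncated Gaussian distribution is proportional to exp(-<it>z</it><sup>2</sup>/(2<it>&#963;</it><sup>2</sup>)) for integer <it>z</it> from <it>-s </it>to <it>s</it>, which is our reading of Equation (9), and the function name is hypothetical.</p>

```python
import math


def truncated_gaussian_shift_prior(s, sigma):
    """Discrete Gaussian weights on the shifts -s..s, normalized to sum to 1.

    Assumption: the weight of shift z is proportional to
    exp(-z**2 / (2 * sigma**2)).
    """
    weights = {z: math.exp(-z * z / (2.0 * sigma * sigma))
               for z in range(-s, s + 1)}
    total = sum(weights.values())
    return {z: weight / total for z, weight in weights.items()}


prior = truncated_gaussian_shift_prior(s=5, sigma=1.0)
p_no_shift = prior[0]
p_one_bp_downstream = prior[1]
p_more_downstream = sum(p for z, p in prior.items() if z > 1)
print(round(p_no_shift, 2),
      round(p_one_bp_downstream, 2),
      round(p_more_downstream, 2))
```

            <p>Under this assumption the three values are close to 0.40, 0.24, and 0.06, in line with the rounded percentages quoted above, and the distribution is symmetric, so the same values hold for upstream shifts.</p>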
            <p>As prior of the parameter <it><ul>&#632;</ul></it><sub>3</sub>, we choose a transformed Dirichlet <it>P</it>(<it><ul>&#632;</ul></it><sub>3</sub>|<it><ul>&#947;</ul></it>) := <it>D</it>(<it><ul>&#632;</ul></it><sub>3</sub>|<it><ul>&#947;</ul></it>) with <it><ul>&#947;</ul></it> = (<it>&#947;</it><sub>0</sub>, <it>&#947;</it><sub>1</sub>), where <it>&#947;</it><sub><it>v </it></sub>denotes the positive hyperparameter of <it>&#632;</it><sub>3,<it>v </it></sub>with <it>v </it>&#8712; {0, 1}.</p>
            <p>Putting all pieces together, we define the prior of the parameters of the mixture model of Equation (4) by:</p>
            <p>
               <display-formula id="M10">
                  <graphic file="gb-2009-10-5-r46-i15.gif"/>
               </display-formula>
            </p>
            <p>stating that we assume <it><ul>&#955;</ul></it>, <it><ul>&#964;</ul></it>, and <it><ul>&#632;</ul></it><sub>3 </sub>to be statistically independent.</p>
            <p>We denote the ESS of the mixture model chosen before inspecting any database by <it>&#949;</it>, and we set the ESS of the PWM model to <it>P</it>(<it>u</it><sub>1 </sub>= 1|<it><ul>&#632;</ul></it><sub>1</sub>)&#183;<it>&#949;</it>, the positive hyperparameters of the strand parameters to <inline-formula><graphic file="gb-2009-10-5-r46-i16.gif"/></inline-formula>, and the ESS of the homogeneous Markov model of order 0 to (<it>L</it> - <it>P</it>(<it>u</it><sub>1 </sub>= 1|<it><ul>&#632;</ul></it><sub>1</sub>)&#183;<it>w</it>)&#183;<it>&#949;</it>. For the reassessment of BSs presented in this article, we choose an ESS of <it>&#949; </it>= 5, yielding an ESS of 4 for the PWM model, <it>&#947;</it><sub>0 </sub>= <it>&#947;</it><sub>1 </sub>= 2, and an ESS of 57 for the homogeneous Markov model of order 0. This choice yields <inline-formula><graphic file="gb-2009-10-5-r46-i22.gif"/></inline-formula> for every <it>a</it> &#8712; {<it>A</it>, <it>C</it>, <it>G</it>, <it>T</it>} and every <it>&#8467;</it> &#8712; [1, <it>w</it>], stating that the chosen prior of the PWM model can be understood as a special case of the BDeu prior <abbrgrp><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp>, which in turn is a special case of the BD prior.</p>
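            <p>The arithmetic behind these prior strengths can be retraced with the following Python sketch. Several points are assumptions made for illustration only: the strand hyperparameters are taken to be half of the ESS of the PWM model, the per-nucleotide hyperparameter is taken to be a quarter of it (a uniform split, as in a BDeu prior), and the values <it>w </it>= 7 and <it>L </it>= 17 are hypothetical values that are not stated in this section but are one combination consistent with an ESS of 57 for the background model.</p>

```python
def prior_strengths(epsilon, p_site, seq_length, motif_width):
    """Split the ESS of the mixture model among its components."""
    ess_pwm = p_site * epsilon
    gamma = ess_pwm / 2.0        # assumed: equal split over both strands
    alpha = ess_pwm / 4.0        # assumed: uniform split over A, C, G, T
    ess_background = (seq_length - p_site * motif_width) * epsilon
    return ess_pwm, gamma, alpha, ess_background


# epsilon = 5 and P(u1 = 1) = 0.8 are taken from the text;
# seq_length = 17 and motif_width = 7 are hypothetical.
ess_pwm, gamma, alpha, ess_background = prior_strengths(
    epsilon=5, p_site=0.8, seq_length=17, motif_width=7)
print(ess_pwm, gamma, alpha, round(ess_background, 6))
```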
         </sec>
         <sec>
            <st>
               <p>Expectation maximization algorithm</p>
            </st>
            <p>The model parameters of the mixture model defined by Equation (4) cannot be estimated analytically, but any numerical optimization algorithm can be used for maximizing the posterior. One popular optimization algorithm for maximizing the likelihood <it>P</it>(<it>S</it>|<it><ul>&#955;</ul></it>, <it><ul>&#964;</ul></it>, <it><ul>&#632;</ul></it>) is the EM algorithm <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. The EM algorithm can be easily modified for maximizing the posterior <it>P</it>(<it><ul>&#955;</ul></it>, <it><ul>&#964;</ul></it>, <it><ul>&#632;</ul></it>|<it>S</it>, <it><ul>&#945;</ul></it>, <it><ul>&#946;</ul></it>, <it><ul>&#947;</ul></it>) given the data set <it>S </it>by iteratively maximizing:</p>
            <p>
               <display-formula id="M11">
                  <graphic file="gb-2009-10-5-r46-i17.gif"/>
               </display-formula>
            </p>
            <p>with</p>
            <p>
               <display-formula id="M12">
                  <graphic file="gb-2009-10-5-r46-i18.gif"/>
               </display-formula>
            </p>
            <p>Q(<it><ul>&#955;</ul></it>, <it><ul>&#964;</ul></it>, <it><ul>&#632;</ul></it>, <it><ul>&#955;</ul></it><sup>(<it>t</it>)</sup>, <it><ul>&#964;</ul></it><sup>(<it>t</it>)</sup>, <it><ul>&#632;</ul></it><sup>(<it>t</it>)</sup>|<it><ul>&#945;</ul></it>, <it><ul>&#946;</ul></it>, <it><ul>&#947;</ul></it>) can be maximized analytically with respect to <it><ul>&#955;</ul></it>, <it><ul>&#964;</ul></it>, and <it><ul>&#632;</ul></it><sub>3</sub>, yielding the familiar expressions provided in Additional data file 2. The posterior <it>P</it>(<it><ul>&#955;</ul></it>, <it><ul>&#964;</ul></it>, <it><ul>&#632;</ul></it>|<it>S</it>, <it><ul>&#945;</ul></it>, <it><ul>&#946;</ul></it>, <it><ul>&#947;</ul></it>) increases monotonically with each iteration, implying that the modified EM algorithm converges to the global maximum, a local maximum, or a saddle point. We stop the algorithm if the logarithmic increase of the posterior between two subsequent iterations becomes smaller than 10<sup>-6</sup>, restart the algorithm 10 times with randomly chosen initial values of <inline-formula><graphic file="gb-2009-10-5-r46-i19.gif"/></inline-formula>, and choose the parameters of that start with the highest posterior, similar to <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B37">37</abbr></abbrgrp>. If we restrict <it>P</it><sub><it>h</it></sub>(<it>u</it><sub>2</sub>|<it>u</it><sub>1</sub>, <it><ul>&#632;</ul></it><sub>2</sub>) to a uniform distribution over all possible start positions, if we set <it>P</it><sub><it>h</it></sub>(<it>u</it><sub>3</sub>|<it>u</it><sub>1 </sub>= 1) = 0.5, and if we restrict the background model to be strand symmetric, then we obtain the probabilistic model that is the basis of <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B37">37</abbr></abbrgrp>.</p>
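            <p>The stopping rule and the restart strategy can be sketched in Python as a generic driver. The sketch does not implement the EM update itself: the functions init, step, and log_posterior are hypothetical placeholders, and the toy update used in the example is only a stand-in that, like the modified EM algorithm, never decreases the objective.</p>

```python
import math
import random


def run_with_restarts(init, step, log_posterior,
                      restarts=10, tol=1e-6, seed=0):
    """Iterate step() until the gain in log posterior falls below tol,
    repeat from several random starts, and keep the best result."""
    rng = random.Random(seed)
    best_params, best_score = None, -math.inf
    for _ in range(restarts):
        params = init(rng)
        score = log_posterior(params)
        while True:
            params = step(params)
            new_score = log_posterior(params)
            converged = tol > new_score - score
            score = new_score
            if converged:
                break
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy stand-in for one iteration: move halfway towards the optimum at 3.
def toy_step(p):
    return p + 0.5 * (3.0 - p)


def toy_log_posterior(p):
    return -(p - 3.0) ** 2


best, best_score = run_with_restarts(
    lambda rng: rng.uniform(-10.0, 10.0), toy_step, toy_log_posterior)
print(best, best_score)
```

            <p>Restarting from several initial values does not guarantee that the global maximum is found, but it reduces the risk of reporting a poor local maximum or a saddle point.</p>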
            <p>The flexibility allowed by MotifAdjuster is important for its practical applicability. Typically, the user has prior knowledge about (i) the expected motif occurrence and (ii) the shift distribution, but (iii) no or only limited prior knowledge about the distribution of the BS strand orientation. Hence, we allow the user to specify the logarithm of the probability that a sequence contains a BS, <it>&#632;</it><sub>1,0</sub>, and a nonuniform distribution that incorporates the prior knowledge of the shift distribution, and we estimate the logarithm of the probability that the BS is located on the forward strand, <it>&#632;</it><sub>3,0</sub>, from the data. This setting also allows MotifAdjuster to work, without additional intervention, in the two extreme cases in which the BSs lie predominantly either on the forward or on the reverse complementary strand.</p>
            <p>Because of the open source license of MotifAdjuster, similar mixture models can be derived and implemented easily, for instance, by using other background and motif models such as Markov models of higher order <abbrgrp><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>, Permuted Markov models <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>, Bayesian networks <abbrgrp><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr></abbrgrp>, or their extensions to variable order <abbrgrp><abbr bid="B48">48</abbr><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr><abbr bid="B53">53</abbr></abbrgrp>.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Case studies</p>
         </st>
         <p>In this section, we present the results of MotifAdjuster applied to seven data sets of <it>Escherichia coli</it>, the validation of MotifAdjuster results for NarL BSs, and the prediction of a novel NarL BS.</p>
         <sec>
            <st>
               <p>Results for seven data sets of <it>Escherichia coli</it></p>
            </st>
            <p>To test the efficacy of MotifAdjuster and to improve the annotation of BSs of <it>Escherichia coli</it>, we extract from the bacterial gene-regulatory reference database CoryneRegNet 4.0 all data sets with at least 30 BSs of at most 25 bp in length. These thresholds are arbitrary, but they are motivated by the intention that the results of the following study should not be influenced by TFs with an insufficient number of BSs or by TFs with an atypical BS length. Seven data sets of BSs corresponding to the TFs CpxR, Crp, Fis, Fnr, Fur, Lrp, and NarL satisfy these requirements, and we apply MotifAdjuster to each of these seven data sets. We summarize the results obtained by MotifAdjuster in Table <tblr tid="T1">1</tblr>, and we provide a complete list of the results in Additional data file 3.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Annotation results</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="left">
                        <p>Gene ID</p>
                     </c>
                     <c ca="left">
                        <p>Gene name</p>
                     </c>
                     <c ca="right">
                        <p>No. BS</p>
                     </c>
                     <c ca="right">
                        <p>BS length</p>
                     </c>
                     <c ca="right">
                        <p>No. removed BSs</p>
                     </c>
                     <c ca="right">
                        <p>No. shifted BSs</p>
                     </c>
                     <c ca="right">
                        <p>Percentage</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>b3357</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>crp</it>
                        </p>
                     </c>
                     <c ca="right">
                        <p>218</p>
                     </c>
                     <c ca="right">
                        <p>22</p>
                     </c>
                     <c ca="right">
                        <p>20</p>
                     </c>
                     <c ca="right">
                        <p>31</p>
                     </c>
                     <c ca="right">
                        <p>23.4%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>b1221</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>narL</it>
                        </p>
                     </c>
                     <c ca="right">
                        <p>74</p>
                     </c>
                     <c ca="right">
                        <p>7</p>
                     </c>
                     <c ca="right">
                        <p>2</p>
                     </c>
                     <c ca="right">
                        <p>11</p>
                     </c>
                     <c ca="right">
                        <p>17.6%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>b3261</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>fis</it>
                        </p>
                     </c>
                     <c ca="right">
                        <p>68</p>
                     </c>
                     <c ca="right">
                        <p>21</p>
                     </c>
                     <c ca="right">
                        <p>13</p>
                     </c>
                     <c ca="right">
                        <p>17</p>
                     </c>
                     <c ca="right">
                        <p>44.1%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>b1334</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>fnr</it>
                        </p>
                     </c>
                     <c ca="right">
                        <p>54</p>
                     </c>
                     <c ca="right">
                        <p>14</p>
                     </c>
                     <c ca="right">
                        <p>2</p>
                     </c>
                     <c ca="right">
                        <p>3</p>
                     </c>
                     <c ca="right">
                        <p>9.3%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>b0683</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>fur</it>
                        </p>
                     </c>
                     <c ca="right">
                        <p>46</p>
                     </c>
                     <c ca="right">
                        <p>15</p>
                     </c>
                     <c ca="right">
                        <p>1</p>
                     </c>
                     <c ca="right">
                        <p>43</p>
                     </c>
                     <c ca="right">
                        <p>95.7%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>b0889</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>lrp</it>
                        </p>
                     </c>
                     <c ca="right">
                        <p>43</p>
                     </c>
                     <c ca="right">
                        <p>12</p>
                     </c>
                     <c ca="right">
                        <p>4</p>
                     </c>
                     <c ca="right">
                        <p>23</p>
                     </c>
                     <c ca="right">
                        <p>62.8%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>b3912</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>cpxR</it>
                        </p>
                     </c>
                     <c ca="right">
                        <p>33</p>
                     </c>
                     <c ca="right">
                        <p>15</p>
                     </c>
                     <c ca="right">
                        <p>9</p>
                     </c>
                     <c ca="right">
                        <p>6</p>
                     </c>
                     <c ca="right">
                        <p>45.5%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="right">
                        <p>536</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="right">
                        <p>51</p>
                     </c>
                     <c ca="right">
                        <p>134</p>
                     </c>
                     <c ca="right">
                        <p>34.5%</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Summary of the results of applying MotifAdjuster to all <it>Escherichia coli</it> data sets of CoryneRegNet 4.0 that contain at least 30 BSs with a length of at most 25 bp. Columns 1 and 2 show the gene ID and gene name of the TF; columns 3 and 4 show the number of BSs stored in the database and their lengths; columns 5 and 6 show the number of BSs proposed to be removed and to be shifted; and column 7 shows the percentage of BSs to be removed or shifted. Interestingly, the percentage of proposed adjustments varies strongly from TF to TF, ranging from 9.3% for Fnr to 95.7% for Fur. In summary, we find in the complete data set of 536 BSs that 51 BSs are proposed to be removed and 134 BSs are proposed to be shifted, resulting in 34.5% of the data set being proposed for adjustments.</p>
               </tblfn>
            </tbl>
            <p>We find that all of the data sets are considered questionable by MotifAdjuster and, more surprisingly, that 34.5% of the 536 BS annotations are proposed for removal or shifts. The percentage of questionably annotated BSs ranges from 9.3% for Fnr to 95.7% for Fur. MotifAdjuster proposes to remove 51 of the 536 BSs and to shift 134 of the remaining 485 BSs by at least one bp, indicating that, in these seven data sets, erroneous shifts of the annotated BSs are the most frequent annotation error. In particular, the percentage of proposed deletions ranges from 2.2% (one of 46) for Fur to 27.3% (nine of 33) for CpxR, and the percentage of proposed shifts ranges from 5.6% (three of 54) for Fnr to 93.5% (43 of 46) for Fur. In more detail, we observe a broad range of shift lengths, from one shift of 4 bp upstream to two shifts of 4 bp downstream, with a sharp peak around 0.</p>
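            <p>The percentages quoted above follow directly from the counts in Table <tblr tid="T1">1</tblr>. The following minimal Python sketch recomputes them as a consistency check; the names TABLE_1, adjusted_percentage, and summarize are illustrative only and are not part of MotifAdjuster.</p>

```python
# Counts transcribed from Table 1: TF -> (annotated BSs, removed BSs, shifted BSs).
TABLE_1 = {
    "Crp": (218, 20, 31),
    "NarL": (74, 2, 11),
    "Fis": (68, 13, 17),
    "Fnr": (54, 2, 3),
    "Fur": (46, 1, 43),
    "Lrp": (43, 4, 23),
    "CpxR": (33, 9, 6),
}


def adjusted_percentage(total, removed, shifted):
    """Percentage of annotated BSs proposed for removal or shift."""
    return 100.0 * (removed + shifted) / total


def summarize(table):
    """Return per-TF percentages and the pooled (total, removed, shifted) counts."""
    per_tf = {tf: round(adjusted_percentage(*counts), 1) for tf, counts in table.items()}
    pooled = tuple(sum(column) for column in zip(*table.values()))
    return per_tf, pooled


per_tf, pooled = summarize(TABLE_1)
for tf, pct in per_tf.items():
    print(f"{tf}: {pct}%")
print("Total:", pooled, f"{round(adjusted_percentage(*pooled), 1)}%")
```

            <p>Pooling the counts before computing the percentage, rather than averaging the seven per-TF percentages, matches how the total row of Table <tblr tid="T1">1</tblr> is reported.</p>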
            <p>For each of the seven TFs, we analyze whether the adjustments proposed by MotifAdjuster result in an improved motif of the BSs (Figure <figr fid="F1">1</figr>). We compute the sequence logos <abbrgrp><abbr bid="B54">54</abbr><abbr bid="B55">55</abbr></abbrgrp> of the original BSs obtained from CoryneRegNet and those of the BSs proposed by MotifAdjuster, which we call original sequence logos and adjusted sequence logos, respectively. Comparing these sequence logos, we find that the adjusted sequence logos show higher conservation than the original sequence logos in all seven cases. We also compare the sequence logos with consensus sequences obtained from the literature <abbrgrp><abbr bid="B56">56</abbr><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr><abbr bid="B61">61</abbr></abbrgrp>, and we find that the adjusted sequence logos are more similar to the consensus sequences than the original sequence logos are. In addition, we find, for the TFs CpxR, Fur, and NarL, that the adjusted sequence logos allow us to recognize clear motifs that could not be recognized in the original sequence logos obtained from CoryneRegNet.</p>
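            <p>One common way to quantify the conservation displayed by a sequence logo is the per-position information content in bits. The sketch below illustrates the idea under stated assumptions: it uses a uniform background, omits any small-sample correction, and runs on hypothetical toy sequences rather than on BSs from CoryneRegNet. It is not the procedure used to produce Figure <figr fid="F1">1</figr>.</p>

```python
import math
from collections import Counter


def information_content(sites):
    """Per-position information content (bits) of equal-length DNA sites.

    Computed as 2 - H(position), assuming a uniform background and no
    small-sample correction; both are simplifications of this sketch.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    bits = []
    for column in zip(*sites):
        counts = Counter(column)
        entropy = -sum(
            (n / len(column)) * math.log2(n / len(column)) for n in counts.values()
        )
        bits.append(2.0 - entropy)
    return bits


# Hypothetical toy data: the same four sites, once with the third site
# annotated 1 bp off and once with all sites aligned.
misaligned = ["TACCCAT", "TACTCAT", "ACTCATA", "TACCCCT"]
aligned = ["TACCCAT", "TACTCAT", "TACTCAT", "TACCCCT"]

print("misaligned:", round(sum(information_content(misaligned)), 2), "bits")
print("aligned:   ", round(sum(information_content(aligned)), 2), "bits")
```

            <p>In this toy example a single site shifted by 1 bp lowers the information content at every position, which is the kind of effect that a realignment of the annotated BSs would be expected to reverse.</p>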
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Comparison of binding-site conservation, showing the original sequence logos, the consensus sequences for the TFs obtained from the literature <abbrgrp><abbr bid="B56">56</abbr><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr><abbr bid="B61">61</abbr></abbrgrp>, and the adjusted sequence logos for the data sets of the TFs CpxR, Crp, Fis, Fnr, Fur, Lrp, and NarL</p>
               </caption>
               <text>
                  <p>Comparison of binding-site conservation, showing the original sequence logos, the consensus sequences for the TFs obtained from the literature <abbrgrp><abbr bid="B56">56</abbr><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr><abbr bid="B61">61</abbr></abbrgrp>, and the adjusted sequence logos for the data sets of the TFs CpxR, Crp, Fis, Fnr, Fur, Lrp, and NarL. We find that (i) the adjusted sequence logos show higher conservation than the original sequence logos in all seven cases; (ii) the adjusted sequence logos are more similar to the consensus sequences than the original sequence logos are; and (iii) clear motifs can be recognized in the adjusted sequence logos of the TFs CpxR, Fur, and NarL that could not be recognized in the original sequence logos.</p>
               </text>
               <graphic file="gb-2009-10-5-r46-1"/>
            </fig>
            <p>We investigate whether the observed rate of proposed adjustments depends systematically on the number of BSs, the BS length, or the GC content of the BSs. We find no obvious dependence of the rate of proposed adjustments on the number of BSs or on the BS length. Comparing the GC content of the BSs, we find that the GC content of the BSs of all but one TF ranges from 30% to 40%. However, the GC content of the Fur BSs is only 20%. This low GC content might be the reason for the unexpectedly high percentage of shifts in this data set, because a BS is more easily shifted by accident in a sequence that is composed of a virtually binary alphabet.</p>
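            <p>The GC content referred to above is the pooled fraction of G and C nucleotides over all BSs of a TF. A minimal sketch of this calculation follows; the function name gc_content is illustrative, and the four input sequences are adjusted NarL BSs taken from Table <tblr tid="T3">3</tblr> purely as example input. Such a small subset is not expected to reproduce the GC content of a complete data set.</p>

```python
def gc_content(sites):
    """Fraction of G and C nucleotides pooled over all given sites."""
    total = sum(len(site) for site in sites)
    if total == 0:
        raise ValueError("no nucleotides given")
    gc = sum(site.upper().count("G") + site.upper().count("C") for site in sites)
    return gc / total


# Example input only: four adjusted NarL BSs listed in Table 3.
example_sites = ["TATTTAT", "TAATGCT", "TATCAAT", "AACTCAT"]
print(f"GC content: {100 * gc_content(example_sites):.1f}%")
```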
         </sec>
         <sec>
            <st>
               <p>Validation of MotifAdjuster results for NarL</p>
            </st>
            <p>To evaluate the previous results, we choose NarL as an example and scrutinize the reannotations proposed by MotifAdjuster for this case. The nitrate regulator NarL of <it>Escherichia coli</it> is one of the key factors controlling the upregulation of the nitrate respiratory pathway and the downregulation of other respiratory chains. In the absence of oxygen, the energetically most efficient anaerobic respiratory chain uses nitrate and nitrite as electron acceptors <abbrgrp><abbr bid="B62">62</abbr></abbrgrp>. Detection of and adaptation to extracellular nitrate levels are accomplished by complex interactions of a double two-component regulatory system, which consists of the homologous sensory proteins NarQ and NarX, and the homologous TFs NarL and NarP. Depending on the BS arrangement and localization relative to the transcription start site, NarL and NarP act as activators or repressors, thereby enabling a flexible control of the expression of nearly 100 genes.</p>
            <p>CoryneRegNet stores 74 NarL BSs, each of length 7 bp (Table <tblr tid="T1">1</tblr>). Of these 74 BSs, only 36 are considered accurate by MotifAdjuster, whereas 38 are considered to be questionable. In 25 cases, MotifAdjuster proposes to switch the strand orientation of the BS; in five cases, it proposes to shift the location of the BS; and for six BSs, it proposes both a switch of strand orientation and a shift of position. In addition, two BSs are proposed for removal. We present a summary of these results in Table <tblr tid="T2">2</tblr>, we provide a complete list of the results in Additional data file 4, and we summarize in Table <tblr tid="T3">3</tblr> those 13 BSs of the regulator NarL for which MotifAdjuster proposes to shift the location of the BS or to remove it from the database.</p>
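            <p>The two kinds of adjustment, a shift of position and a switch of strand orientation, can be illustrated with rows of Table <tblr tid="T3">3</tblr>. The sketch below is not MotifAdjuster code. It assumes that a shift of +1 moves the 7-bp window one position to the right and that a reverse orientation corresponds to reading the reverse complement. The 8-bp context strings are reconstructed here by overlaying the original and adjusted BSs of the respective rows, and the helper names are hypothetical.</p>

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def adjust_site(context, start, length, shift=0, switch_strand=False):
    """Return the BS of `length` bp that begins at `start + shift` in `context`.

    If `switch_strand` is true, the reverse complement of the window is returned.
    """
    begin = start + shift
    if begin not in range(0, len(context) - length + 1):
        raise ValueError("shifted window leaves the available context")
    site = context[begin:begin + length]
    return reverse_complement(site) if switch_strand else site


# Contexts reconstructed from Table 3 (assumption of this sketch):
# original BS followed by the one extra base implied by the adjusted BS.
print(adjust_site("AATAAATA", 0, 7, shift=1, switch_strand=True))   # focA, first row
print(adjust_site("ATAATGCT", 0, 7, shift=1, switch_strand=False))  # focA, second row
print(adjust_site("TAGGAATT", 0, 7, shift=1, switch_strand=True))   # narG
```

            <p>Under these assumptions, the three calls return the adjusted BSs listed for the corresponding rows in Table <tblr tid="T3">3</tblr>.</p>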
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>NarL annotation results: Number of binding-site shifts and strand switches</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>No strand switch</p>
                     </c>
                     <c ca="center">
                        <p>Strand switch</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>No position shift</p>
                     </c>
                     <c ca="center">
                        <p>36</p>
                     </c>
                     <c ca="center">
                        <p>25</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Position shift</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Removed</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>2</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Application of MotifAdjuster to the set of 74 NarL BSs results in adjustments proposed for 38 of these BSs. Two BSs are proposed to be removed from the data set. Of the remaining 36 questionable BSs, 25 are proposed to have a wrong strand annotation but a correct position, and five are proposed to have a correct strand annotation but a wrong position. For six BSs, both strand annotation and position are proposed to be wrong.</p>
               </tblfn>
            </tbl>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>NarL binding sites with questionable annotations</p>
               </caption>
               <tblbdy cols="8">
                  <r>
                     <c ca="center">
                        <p>Gene ID</p>
                     </c>
                     <c ca="center">
                        <p>Gene name</p>
                     </c>
                     <c ca="center">
                        <p>BS</p>
                     </c>
                     <c ca="center">
                        <p>Lit.</p>
                     </c>
                     <c ca="center">
                        <p>Occ.</p>
                     </c>
                     <c ca="center">
                        <p>Shift</p>
                     </c>
                     <c ca="center">
                        <p>Strand</p>
                     </c>
                     <c ca="center">
                        <p>Adj. BS</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b0904</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>focA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>AATAAAT</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B63">63</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>+1</p>
                     </c>
                     <c ca="center">
                        <p>Reverse</p>
                     </c>
                     <c ca="center">
                        <p>TATTTAT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b0904</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>focA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>ATAATGC</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B63">63</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>+1</p>
                     </c>
                     <c ca="center">
                        <p>Forward</p>
                     </c>
                     <c ca="center">
                        <p>TAATGCT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b0904</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>focA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>ATATCAA</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B63">63</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>+1</p>
                     </c>
                     <c ca="center">
                        <p>Forward</p>
                     </c>
                     <c ca="center">
                        <p>TATCAAT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b0904</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>focA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>CAACTCA</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B63">63</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>+1</p>
                     </c>
                     <c ca="center">
                        <p>Forward</p>
                     </c>
                     <c ca="center">
                        <p>AACTCAT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b0904</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>focA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>CATTAAT</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B63">63</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>+1</p>
                     </c>
                     <c ca="center">
                        <p>Reverse</p>
                     </c>
                     <c ca="center">
                        <p>TATTAAT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b0904</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>focA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>GATCGAT</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B63">63</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>+1</p>
                     </c>
                     <c ca="center">
                        <p>Reverse</p>
                     </c>
                     <c ca="center">
                        <p>TATCGAT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b0904</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>focA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>GTAATTA</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B63">63</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>+1</p>
                     </c>
                     <c ca="center">
                        <p>Forward</p>
                     </c>
                     <c ca="center">
                        <p>TAATTAT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b0904</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>focA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>TATCGGT</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B63">63</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>+1</p>
                     </c>
                     <c ca="center">
                        <p>Reverse</p>
                     </c>
                     <c ca="center">
                        <p>TACCGAT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b0904</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>focA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>TTACTCC</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B63">63</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>+1</p>
                     </c>
                     <c ca="center">
                        <p>Forward</p>
                     </c>
                     <c ca="center">
                        <p>TACTCCG</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b1223</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>narK</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>CACTGTA</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B64">64</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b1224</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>narG</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>TAGGAAT</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B64">64</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>+1</p>
                     </c>
                     <c ca="center">
                        <p>Reverse</p>
                     </c>
                     <c ca="center">
                        <p>AATTCCT</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b4070</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>nrfA</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>TGTGGTT</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B65">65</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>+1</p>
                     </c>
                     <c ca="center">
                        <p>Reverse</p>
                     </c>
                     <c ca="center">
                        <p>TAACCAC</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>b4123</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>dcuB</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>ATGTTAT</p>
                     </c>
                     <c ca="center">
                        <p>
                           <abbrgrp>
                              <abbr bid="B66">66</abbr>
                           </abbrgrp>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Annotated NarL BSs for which MotifAdjuster proposes either to shift the BS or to remove it from the data set. Columns 1 to 3 contain the gene ID, the gene name, and the BS (as stored in the database). Column 4 cites the original literature for this BS. Columns 5 through 7 comprise the three possible adjustments suggested by MotifAdjuster: removal, shift, and strand orientation (relative to the target gene). In column 5, a value of 0 indicates that the BS is proposed for removal, and in column 6, a positive (negative) value denotes a shift of the BS to the right (left). Finally, column 8 provides the adjusted BS. Interestingly, we find that the two BSs that are proposed to be removed are not mentioned in the original literature, and in 10 of the 11 cases, the shifted BS is consistent with the BS published in the original literature. MotifAdjuster also proposes to switch the BS strand in six of the 11 cases.</p>
               </tblfn>
            </tbl>
            <p>To evaluate the accuracy of MotifAdjuster, we check the original literature <abbrgrp><abbr bid="B63">63</abbr><abbr bid="B37">37</abbr></abbrgrp> for each of the 13 questionable BS candidates. Comparing both, we find that the proposed annotations agree with those in the literature in all cases but one (BS of gene <it>b1224</it>). That is, in 12 of 13 cases signaled by MotifAdjuster as being questionable, the detected error was indeed caused by an inaccurate transfer from the original literature into the gene-regulatory databases RegulonDB and CoryneRegNet. Of those 12 questionable BSs, 10 BSs are correctly proposed to be shifted, and two are correctly proposed to be removed.</p>
            <p>Turning to the BS of the gene <it>b1224</it>, we find it is published as given in the databases <abbrgrp><abbr bid="B64">64</abbr></abbrgrp>, in contrast to the proposal of MotifAdjuster. However, Darwin <it>et al</it>. <abbrgrp><abbr bid="B67">67</abbr></abbrgrp> report that a mutation of this BS has little or no effect on the expression of <it>b1224</it>. Hence, the proposal could possibly be correct, and the BS could be shifted or even be deleted.</p>
            <p>In addition, MotifAdjuster checks the strand annotation of BSs and proposes strand switches if needed. To validate these annotations, we cannot use the annotations from RegulonDB and CoryneRegNet, because these databases contain all BSs in 5'&#8594;3' direction relative to the target gene. Hence, we consult annotation experts at the Center for Biotechnology in Bielefeld to reannotate the strand orientation of the BSs manually, and we compare the results with those of MotifAdjuster. Interestingly, we find that the strand orientations proposed by MotifAdjuster are in perfect (100%) agreement with the manually curated strand orientations. As an independent test of the efficacy of MotifAdjuster for NarL BSs, we use the manually annotated BSs provided by the PRODORIC database <abbrgrp><abbr bid="B68">68</abbr></abbrgrp>. Remarkably, we find in this case as well that the results of MotifAdjuster perfectly agree with the annotations.</p>
            <p>Another hint that the proposed adjustments of MotifAdjuster could be reasonable is based on the observation that NarL and NarP homodimers bind to a 7-2-7' BS arrangement <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>, an inverted repeat structure consisting of a BS on the forward strand, a 2-bp spacer, and a BS on the reverse complementary strand. NarP exclusively binds as a homodimer to this 7-2-7' structure. NarL homodimers bind at 7-2-7' sites with high affinity, but NarL monomers can also bind to a variety of other heptamer arrangements. Instances of this 7-2-7' structure have been reported for four genes: <it>fdnG</it>, <it>napF</it>, <it>nirB</it>, and <it>nrfA </it><abbrgrp><abbr bid="B61">61</abbr><abbr bid="B65">65</abbr></abbrgrp>. In contrast to this observation, all BSs in CoryneRegNet as well as RegulonDB are annotated to be on the forward strand, including the second half of the inverted repeat. When applied to these four genes, MotifAdjuster proposes all heptamers of the second half of the 7-2-7' structure to be switched to the reverse strand, in agreement with <abbrgrp><abbr bid="B61">61</abbr><abbr bid="B65">65</abbr></abbrgrp>. In addition, MotifAdjuster proposes six additional 7-2-7' BS arrangements, located in the upstream regions of the genes <it>adhE</it>, <it>aspA</it>, <it>dcuS</it>, <it>frdA</it>, <it>hcp</it>, and <it>norV</it>. The positions and the orientations are presented in Additional data file 4.</p>
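            <p>As an illustration of the 7-2-7' arrangement described above, the following minimal Python sketch scans a list of strand-annotated heptamers for pairs consisting of a forward-strand heptamer followed, after a 2-bp spacer, by a reverse-strand heptamer. The Site record, the function name find_7_2_7, and the convention that every start is the leftmost forward-strand coordinate of the heptamer are assumptions made for this example; they are not part of MotifAdjuster or of the database formats.</p>
            <p>
```python
from collections import namedtuple

# Hypothetical record: leftmost forward-strand coordinate and strand ("+" or "-").
Site = namedtuple("Site", ["start", "strand"])

HEPTAMER = 7  # length of one NarL half-site
SPACER = 2    # spacer length in the 7-2-7' arrangement


def find_7_2_7(sites):
    """Return (forward_start, reverse_start) pairs forming a 7-2-7' arrangement.

    A pair qualifies if a forward-strand heptamer starting at p is followed
    by a reverse-strand heptamer starting at p + 7 + 2.
    """
    forward_starts = {site.start for site in sites if site.strand == "+"}
    reverse_starts = {site.start for site in sites if site.strand == "-"}
    offset = HEPTAMER + SPACER
    return sorted(
        (start, start + offset)
        for start in forward_starts
        if start + offset in reverse_starts
    )


if __name__ == "__main__":
    # Toy annotations, not taken from any database.
    example = [Site(10, "+"), Site(19, "-"), Site(40, "+"), Site(49, "+")]
    print(find_7_2_7(example))
```
            </p>
            <p>In this toy input only the first pair qualifies, because the heptamer at position 49 is annotated on the forward strand. This mirrors the situation in the databases, where the second half of an inverted repeat is invisible as such until its strand annotation is switched.</p>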
         </sec>
         <sec>
            <st>
               <p>Prediction of a novel NarL binding site</p>
            </st>
            <p>After investigating to which degree MotifAdjuster is capable of finding errors in existing gene-regulatory databases, it is interesting to test whether MotifAdjuster could be helpful for finding novel BSs. The flexibility of BS arrangements and the low motif conservation complicate the computational and manual prediction of NarL BSs by curation teams. This results in several cases in which promoter regions are experimentally verified to be bound by NarL, but in which no NarL BS could be detected <abbrgrp><abbr bid="B69">69</abbr><abbr bid="B70">70</abbr></abbrgrp>. Examples of such genes are <it>caiF </it><abbrgrp><abbr bid="B71">71</abbr></abbrgrp>, <it>torC </it><abbrgrp><abbr bid="B72">72</abbr></abbrgrp>, <it>nikA </it><abbrgrp><abbr bid="B73">73</abbr></abbrgrp>, <it>ubiC </it><abbrgrp><abbr bid="B74">74</abbr></abbrgrp>, and <it>fdhF </it><abbrgrp><abbr bid="B75">75</abbr></abbrgrp>. We extract the upstream regions of these genes, where an upstream sequence is defined by CoryneRegNet as the sequence between positions -560 bp and +20 bp relative to the first position of the annotated start codon of the first gene of the target operon. In addition, we extract those upstream regions of <it>Escherichia coli </it>that belong to operons not annotated as being regulated by NarL (background data set).</p>
            <p>We investigate whether we can now detect NarL BSs based on the adjusted data set that could not be detected based on the original data set from CoryneRegNet. For that purpose, we estimate the parameters <it><ul>&#955;</ul></it> of the PWM model on the adjusted data set as proposed by MotifAdjuster and <it><ul>&#964;</ul></it> of the homogeneous Markov model on the background data set. From the adjusted PWM, we build a mixture model over both strands with the same probability for each strand; that is, exp(<it>&#632;</it><sub>3,0</sub>) = exp(<it>&#632;</it><sub>3,1</sub>) = 0.5. For the classification of an unknown heptamer <it><ul>x</ul></it>, we build a simple likelihood-ratio classifier with these parameters <it><ul>&#955;</ul></it>, <it><ul>&#964;</ul></it>, <it><ul>&#632;</ul></it><sub>3 </sub>and define the log-likelihood ratio by</p>
            <p>
               <display-formula id="M13">
                  <graphic file="gb-2009-10-5-r46-i20.gif"/>
               </display-formula>
            </p>
            <p>For an upstream region, we compute <it>r</it><sub><it>max </it></sub>defined as the highest log-likelihood ratio of any heptamer <it><ul>x</ul></it> in this upstream region. We compute the <it>P </it>value of a potential BS <it><ul>x</ul></it> with value <it>r</it>(<it><ul>x</ul></it>) as the fraction of the background sequences whose <it>r</it><sub><it>max</it></sub>-values exceed <it>r</it>(<it><ul>x</ul></it>).</p>
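            <p>The scoring procedure described above can be sketched in Python as follows. This is a simplified illustration, not the MotifAdjuster or Jstacs implementation: it assumes a zeroth-order homogeneous background model (the order is not restated here), a PWM stored as one dictionary of strictly positive base probabilities per position, and function names chosen only for this example. The strand mixture uses the weight 0.5 for each strand, as stated in the text.</p>
            <p>
```python
import math

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def pwm_prob(pwm, site):
    """Probability of a site under a PWM given as one dict per position."""
    prob = 1.0
    for column, base in zip(pwm, site):
        prob *= column[base]
    return prob


def log_likelihood_ratio(pwm, background, site):
    """Strand mixture (weight 0.5 each) against a zeroth-order background.

    Assumes all probabilities are strictly positive (e.g. via pseudocounts).
    """
    foreground = 0.5 * pwm_prob(pwm, site) + 0.5 * pwm_prob(pwm, reverse_complement(site))
    null = 1.0
    for base in site:
        null *= background[base]
    return math.log(foreground) - math.log(null)


def r_max(pwm, background, region):
    """Highest log-likelihood ratio over all windows of PWM width in a region."""
    width = len(pwm)
    return max(
        log_likelihood_ratio(pwm, background, region[start:start + width])
        for start in range(len(region) - width + 1)
    )


def p_value(score, background_r_max):
    """Fraction of background regions whose r_max exceeds the given score."""
    exceeding = sum(1 for value in background_r_max if value > score)
    return exceeding / len(background_r_max)
```
            </p>
            <p>In use, one would compute r_max once for every background upstream region, store the values in a list, and pass that list to p_value together with the score of the candidate heptamer. Because the foreground is a symmetric mixture over both strands, a heptamer and its reverse complement receive the same foreground probability in this sketch.</p>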
            <p>With this classifier, a significant NarL BS can now be detected in the upstream region of <it>torC</it>. Figure <figr fid="F2">2a</figr> shows the double-stranded DNA fragment with the predicted BS (TACCCCT) located on the forward strand starting at -209 bp relative to the start codon, and at -181 bp relative to the annotated transcription start site <abbrgrp><abbr bid="B76">76</abbr></abbrgrp>. The distance of the predicted BS to the start codon agrees with the distance distribution of previously known NarL BSs (Figure <figr fid="F2">2b</figr>), providing additional evidence for the predicted BS. This finding closes the gap between sequence-analysis and gene-expression studies, as the <it>torCAD </it>operon consists of three genes that are essential for the trimethylamine <it>N</it>-oxide (TMAO) respiratory pathway <abbrgrp><abbr bid="B76">76</abbr></abbrgrp>. TMAO is present as an osmoprotector in tissues of invertebrates and can be used as a respiratory electron acceptor by <it>Escherichia coli</it>. Transcriptional regulation of this operon by NarL binding to the proposed BS would explain nitrate-dependent repression of TMAO-terminal reductase (TorA) activity under anaerobic conditions <abbrgrp><abbr bid="B72">72</abbr></abbrgrp>, thereby linking TMAO and nitrate respiration.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Position of the predicted NarL binding site in the upstream region of <it>torC</it></p>
               </caption>
               <text>
                  <p>Position of the predicted NarL binding site in the upstream region of <it>torC</it>. The NarL BS TACCCCT is located on the forward strand with respect to the target operon <it>torCAD </it>starting at position -209 bp (red color). All positions are relative to the first nucleotide of the start codon of <it>torC</it>. <b>(a) </b>The fragment of the upstream region of the <it>torCAD </it>operon containing the NarL BS predicted by the PWM model trained on the adjusted data set. <b>(b) </b>Histogram of all positions of NarL BSs in the database. The red line indicates the position of the predicted BS.</p>
               </text>
               <graphic file="gb-2009-10-5-r46-2"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusions</p>
         </st>
         <p>Gene-regulatory databases, such as AGRIS, AthaMap, CoryneRegNet, CTCFBSDB, JASPAR, ORegAnno, PRODORIC, RegulonDB, SCPD, TRANSFAC, TRED, or TRRD store valuable information about gene-regulatory networks, including TFs and their BSs. These BSs are usually manually extracted from the original literature and subsequently stored in databases. The whole pipeline of wet-lab BS identification and annotation, publication, and manual transfer from the scientific literature to data repositories is not just time consuming but also error prone, leading to many false annotations currently present in databases.</p>
         <p>MotifAdjuster is a software tool that supports the (re-)annotation process of BSs <it>in silico</it>. It can be applied as a quality-assurance tool for monitoring putative errors in existing BS repositories and for assisting with manual strand annotation. MotifAdjuster maximizes the posterior of the parameters of a simple mixture model by considering the possibilities that (i) a sequence annotated as containing a BS in reality does not contain a BS; (ii) the annotated BS is erroneously shifted by a few base pairs; and (iii) the annotated BS is erroneously located on the wrong strand and must be reverse complemented. In contrast to existing <it>de</it>-<it>novo </it>motif-discovery algorithms, MotifAdjuster allows the user to specify the probability of finding a BS in a sequence and to specify a nonuniform shift distribution.</p>
         <p>We apply MotifAdjuster to seven data sets of BSs for the TFs CpxR, Crp, Fis, Fnr, Fur, Lrp, and NarL with a total of 536 BSs, and we find 51 BSs proposed for removal and 134 BSs proposed for shifts. In total, this results in 34.5% of the BSs being proposed for adjustments. We choose NarL as an example to scrutinize the proposed reannotations of MotifAdjuster. Checking the original literature for each of the 13 NarL BSs proposed for removal or shift shows that the proposals of MotifAdjuster agree with the published data in 12 of the 13 cases. Comparing the strand annotation of MotifAdjuster with independent information indicates that the proposals of MotifAdjuster are in accordance with human expertise. Furthermore, MotifAdjuster enables the detection of a novel BS responsible for the regulation of the <it>torCAD </it>operon, finally augmenting experimental evidence of its NarL regulation. MotifAdjuster is an open-source software tool that can be downloaded, extended easily if needed, and used for computational reassessments of BS annotations.</p>
      </sec>
      <sec>
         <st>
            <p>Availability and requirements</p>
         </st>
         <p>Project name: MotifAdjuster. Project home page: <abbrgrp><abbr bid="B77">77</abbr></abbrgrp>. Operating system(s): platform independent. Programming language: Java 1.5. Requirements: Jstacs 1.2.2. License: GNU General Public License version 3.</p>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>BS: binding site; EM: expectation maximization; ESS: equivalent sample size; MAP: maximum a posteriori; PWM: position weight matrix; TF: transcription factor.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>JK and IG developed the basic idea, and JK implemented MotifAdjuster. JB and TK provided the data. All authors contributed to data analysis and writing, and approved the final manuscript.</p>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>The following additional data are available with the online version of this article. Additional data file <supplr sid="S1">1</supplr> contains a comparison of <it>de-novo </it>motif-discovery tools including MEME, RecursiveSampler, Improbizer, SeSiMCMC, A-GLAM, and MotifAdjuster for the reannotation of NarL. Additional data file <supplr sid="S2">2</supplr> contains a detailed description of the MAP parameter estimators of the model. Additional data file <supplr sid="S3">3</supplr> contains a list of MotifAdjuster results for all seven data sets. Additional data file <supplr sid="S4">4</supplr> contains a list of MotifAdjuster results compared with the original input of CoryneRegNet and RegulonDB for the TF NarL.</p>
         <suppl id="S1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>Comparison of <it>de-novo </it>motif-discovery tools</p>
            </caption>
            <text>
               <p>Comparison of <it>de-novo </it>motif-discovery tools including MEME, RecursiveSampler, Improbizer, SeSiMCMC, A-GLAM, and MotifAdjuster for the reannotation of NarL.</p>
            </text>
            <file name="gb-2009-10-5-r46-S1.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S2">
            <title>
               <p>Additional data file 2</p>
            </title>
            <caption>
               <p>Detailed description of the MAP parameter estimators</p>
            </caption>
            <text>
               <p>Detailed description of the MAP parameter estimators of the model.</p>
            </text>
            <file name="gb-2009-10-5-r46-S2.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S3">
            <title>
               <p>Additional data file 3</p>
            </title>
            <caption>
               <p>List of MotifAdjuster results</p>
            </caption>
            <text>
               <p>List of MotifAdjuster results for all seven data sets.</p>
            </text>
            <file name="gb-2009-10-5-r46-S3.txt">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S4">
            <title>
               <p>Additional data file 4</p>
            </title>
            <caption>
               <p>List of MotifAdjuster results for the TF NarL</p>
            </caption>
            <text>
               <p>List of MotifAdjuster results for the TF NarL compared with the original input of CoryneRegNet and RegulonDB.</p>
            </text>
            <file name="gb-2009-10-5-r46-S4.txt">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Lothar Altschmied, Helmut B&#228;umlein, Karina Brinkrolf, Linda G&#246;tz, Jan Grau, Astrid Junker, Gudrun M&#246;nke, Michaela Mohr, Stefan Posch, Yvonne P&#246;schl, Sven Rahmann, Michael Seifert, Marc Strickert, and Andreas Tauch for helpful discussions, two anonymous reviewers for their valuable comments, Alexander Goesmann, Achim Neumann, and Ralf Nolte for expert technical support, and Richard M&#252;nch for his help with the RegulonDB data. J.B. greatly appreciates the support of the German Academic Exchange Service (DAAD). This work was supported by grant 0312706A by the German Ministry of Education and Research (BMBF) and XP3624HP/0606T by the Ministry of Culture of Saxony-Anhalt.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Evolution of transcription factors and the gene regulatory network in <it>Escherichia coli</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Babu</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Teichmann</snm>
                  <fnm>SA</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>1234</fpage>
            <lpage>1244</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">150228</pubid>
                  <pubid idtype="pmpid" link="fulltext">12582243</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg210</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Transcription factors: structural families and principles of DNA recognition.</p>
            </title>
            <aug>
               <au>
                  <snm>Pabo</snm>
                  <fnm>CO</fnm>
               </au>
               <au>
                  <snm>Sauer</snm>
                  <fnm>RT</fnm>
               </au>
            </aug>
            <source>Annu Rev Biochem</source>
            <pubdate>1992</pubdate>
            <volume>61</volume>
            <fpage>1053</fpage>
            <lpage>1095</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.bi.61.070192.005201</pubid>
                  <pubid idtype="pmpid" link="fulltext">1497306</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Electrophoretic mobility shift assay (EMSA) for detecting protein-nucleic acid interactions.</p>
            </title>
            <aug>
               <au>
                  <snm>Hellman</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Fried</snm>
                  <fnm>MG</fnm>
               </au>
            </aug>
            <source>Nat Protoc</source>
            <pubdate>2007</pubdate>
            <volume>2</volume>
            <fpage>1849</fpage>
            <lpage>1861</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nprot.2007.249</pubid>
                  <pubid idtype="pmpid" link="fulltext">17703195</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>DNAse footprinting: a simple method for the detection of protein-DNA binding specificity.</p>
            </title>
            <aug>
               <au>
                  <snm>Galas</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Schmitz</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1978</pubdate>
            <volume>5</volume>
            <fpage>3157</fpage>
            <lpage>3170</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">342238</pubid>
                  <pubid idtype="pmpid" link="fulltext">212715</pubid>
                  <pubid idtype="doi">10.1093/nar/5.9.3157</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Nonisotopic quantitative analysis of protein-DNA interactions at equilibrium.</p>
            </title>
            <aug>
               <au>
                  <snm>Benotmane</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Hoylaerts</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Collen</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Belayew</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Analyt Biochem</source>
            <pubdate>1997</pubdate>
            <volume>250</volume>
            <fpage>181</fpage>
            <lpage>185</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/abio.1997.2231</pubid>
                  <pubid idtype="pmpid" link="fulltext">9245437</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Seed-specific transcription factors ABI3 and FUS3: molecular interaction with DNA.</p>
            </title>
            <aug>
               <au>
                  <snm>M&#246;nke</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Altschmied</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Tewes</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Reidt</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Mock</snm>
                  <fnm>HP</fnm>
               </au>
               <au>
                  <snm>B&#228;umlein</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Conrad</snm>
                  <fnm>U</fnm>
               </au>
            </aug>
            <source>Planta</source>
            <pubdate>2004</pubdate>
            <volume>219</volume>
            <fpage>158</fpage>
            <lpage>166</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00425-004-1206-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">14767767</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Protein-DNA interaction mapping using genomic tiling path microarrays in <it>Drosophila</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Sun</snm>
                  <fnm>LV</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Greil</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Negre</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Cavalli</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Zhao</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Steensel</snm>
                  <fnm>BV</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>KP</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>9428</fpage>
            <lpage>9433</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">170935</pubid>
                  <pubid idtype="pmpid" link="fulltext">12876199</pubid>
                  <pubid idtype="doi">10.1073/pnas.1533393100</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Fitting a mixture model by expectation maximization to discover motifs in biopolymers.</p>
            </title>
            <aug>
               <au>
                  <snm>Bailey</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Elkan</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Proc Int Conf Intell Syst Mol Biol</source>
            <pubdate>1994</pubdate>
            <volume>2</volume>
            <fpage>28</fpage>
            <lpage>36</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7584402</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment.</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Boguski</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Neuwald</snm>
                  <fnm>AF</fnm>
               </au>
               <au>
                  <snm>Wootton</snm>
                  <fnm>JC</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1993</pubdate>
            <volume>262</volume>
            <fpage>208</fpage>
            <lpage>214</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.8211139</pubid>
                  <pubid idtype="pmpid" link="fulltext">8211139</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Gibbs Recursive Sampler: finding transcription factor binding sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Rouchka</snm>
                  <fnm>EC</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>3580</fpage>
            <lpage>3585</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">169014</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824370</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg608</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Environmentally induced foregut remodeling by PHA-4/FoxA and DAF-12/NHR.</p>
            </title>
            <aug>
               <au>
                  <snm>Ao</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Gaudet</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Muttumu</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mango</snm>
                  <fnm>SE</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>305</volume>
            <fpage>1743</fpage>
            <lpage>1746</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1102216</pubid>
                  <pubid idtype="pmpid" link="fulltext">15375261</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>A Gibbs sampler for identification of symmetrically structured, spaced DNA motifs with improved estimation of the signal length.</p>
            </title>
            <aug>
               <au>
                  <snm>Favorov</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Gelfand</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Gerasimova</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Ravcheev</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Makeev</snm>
                  <fnm>VJ</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>2240</fpage>
            <lpage>2245</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti336</pubid>
                  <pubid idtype="pmpid" link="fulltext">15728117</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Finding sequence motifs with Bayesian models incorporating positional information: an application to transcription factor binding sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Kim</snm>
                  <fnm>NK</fnm>
               </au>
               <au>
                  <snm>Tharakaraman</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Marino-Ramirez</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Spouge</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2008</pubdate>
            <volume>9</volume>
            <fpage>262</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2432075</pubid>
                  <pubid idtype="pmpid" link="fulltext">18533028</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-9-262</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Integrated analysis and reconstruction of microbial transcriptional gene regulatory networks using CoryneRegNet.</p>
            </title>
            <aug>
               <au>
                  <snm>Baumbach</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wittkop</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kleindt</snm>
                  <fnm>CK</fnm>
               </au>
               <au>
                  <snm>Tauch</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nature Protocols</source>
            <pubdate>2009</pubdate>
            <inpress/>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">19498379</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>PRODORIC: prokaryotic database of gene regulation.</p>
            </title>
            <aug>
               <au>
                  <snm>M&#252;nch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hiller</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Barg</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Heldt</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Linz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wingender</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Jahn</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>266</fpage>
            <lpage>269</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165484</pubid>
                  <pubid idtype="pmpid" link="fulltext">12519998</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg037</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>RegulonDB (version 6.0): gene regulation model of <it>Escherichia coli </it>K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation.</p>
            </title>
            <aug>
               <au>
                  <snm>Gama-Castro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Jim&#233;nez-Jacinto</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Peralta-Gil</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Santos-Zavaleta</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pe&#241;aloza-Spinola</snm>
                  <fnm>MI</fnm>
               </au>
               <au>
                  <snm>Contreras-Moreira</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Segura-Salazar</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mu&#241;iz-Rascado</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Mart&#237;nez-Flores</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Salgado</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bonavides-Mart&#237;nez</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Abreu-Goodger</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Rodr&#237;guez-Penagos</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Miranda-R&#237;os</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Morett</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Merino</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Huerta</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Trevi&#241;o-Quintanilla</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Collado-Vides</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2008</pubdate>
            <volume>36</volume>
            <fpage>D120</fpage>
            <lpage>D124</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2238961</pubid>
                  <pubid idtype="pmpid" link="fulltext">18158297</pubid>
                  <pubid idtype="doi">10.1093/nar/gkm994</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>AGRIS and AtRegNet: a platform to link cis-regulatory elements and transcription factors into regulatory networks.</p>
            </title>
            <aug>
               <au>
                  <snm>Palaniswamy</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>James</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sun</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lamb</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Davuluri</snm>
                  <fnm>RV</fnm>
               </au>
               <au>
                  <snm>Grotewold</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>2006</pubdate>
            <volume>140</volume>
            <fpage>818</fpage>
            <lpage>829</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1400579</pubid>
                  <pubid idtype="pmpid" link="fulltext">16524982</pubid>
                  <pubid idtype="doi">10.1104/pp.105.072280</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>AthaMap, integrating transcriptional and post-transcriptional data.</p>
            </title>
            <aug>
               <au>
                  <snm>B&#252;low</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Engelmann</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Schindler</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hehl</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2009</pubdate>
            <volume>37</volume>
            <fpage>D983</fpage>
            <lpage>D986</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">18842622</pubid>
                  <pubid idtype="doi">10.1093/nar/gkn709</pubid>
                  <pubid idtype="pmcid">2686474</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>CTCFBSDB: a CTCF-binding site database for characterization of vertebrate genomic insulators.</p>
            </title>
            <aug>
               <au>
                  <snm>Bao</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Cui</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2008</pubdate>
            <volume>36</volume>
            <fpage>D83</fpage>
            <lpage>D87</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2238977</pubid>
                  <pubid idtype="pmpid" link="fulltext">17981843</pubid>
                  <pubid idtype="doi">10.1093/nar/gkm875</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>JASPAR: an open-access database for eukaryotic transcription factor binding profiles.</p>
            </title>
            <aug>
               <au>
                  <snm>Sandelin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Alkema</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Engstr&#246;m</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Wasserman</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Lenhard</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>D91</fpage>
            <lpage>D94</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308747</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681366</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh012</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>ORegAnno: an open access database and curation system for literature-derived promoters, transcription factor binding sites and regulatory variation.</p>
            </title>
            <aug>
               <au>
                  <snm>Montgomery</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Griffith</snm>
                  <fnm>OL</fnm>
               </au>
               <au>
                  <snm>Sleumer</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Bergman</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Bilenky</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pleasance</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Prychyna</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>SJM</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <fpage>637</fpage>
            <lpage>640</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btk027</pubid>
                  <pubid idtype="pmpid" link="fulltext">16397004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>SCPD: a promoter database of the yeast <it>Saccharomyces cerevisiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1999</pubdate>
            <volume>15</volume>
            <fpage>607</fpage>
            <lpage>611</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/15.7.607</pubid>
                  <pubid idtype="pmpid" link="fulltext">10487868</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes.</p>
            </title>
            <aug>
               <au>
                  <snm>Matys</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Kel-Margoulis</snm>
                  <fnm>OV</fnm>
               </au>
               <au>
                  <snm>Fricke</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Liebich</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Land</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Barre-Dirrie</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Reuter</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Chekmenev</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Krull</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hornischer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Voss</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Stegmaier</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lewicki-Potapov</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Saxel</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kel</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Wingender</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>D108</fpage>
            <lpage>D110</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347505</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381825</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj143</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>TRED: a transcriptional regulatory element database, new entries and other development.</p>
            </title>
            <aug>
               <au>
                  <snm>Jiang</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Xuan</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Zhao</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>MQ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <fpage>D137</fpage>
            <lpage>D140</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1899102</pubid>
                  <pubid idtype="pmpid" link="fulltext">17202159</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl1041</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Transcription Regulatory Regions Database (TRRD): its status in 2002.</p>
            </title>
            <aug>
               <au>
                  <snm>Kolchanov</snm>
                  <fnm>NA</fnm>
               </au>
               <au>
                  <snm>Ignatieva</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Ananko</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Podkolodnaya</snm>
                  <fnm>OA</fnm>
               </au>
               <au>
                  <snm>Stepanenko</snm>
                  <fnm>IL</fnm>
               </au>
               <au>
                  <snm>Merkulova</snm>
                  <fnm>TI</fnm>
               </au>
               <au>
                  <snm>Pozdnyakov</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Podkolodny</snm>
                  <fnm>NL</fnm>
               </au>
               <au>
                  <snm>Naumochkin</snm>
                  <fnm>AN</fnm>
               </au>
               <au>
                  <snm>Romashchenko</snm>
                  <fnm>AG</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>312</fpage>
            <lpage>317</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99088</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752324</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.312</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>MATCH: a tool for searching transcription factor binding sites in DNA sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Kel</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>G&#246;ssling</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Reuter</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Cheremushkin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Kel-Margoulis</snm>
                  <fnm>OV</fnm>
               </au>
               <au>
                  <snm>Wingender</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>3576</fpage>
            <lpage>3579</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">169193</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824369</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg585</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Assessing computational tools for the discovery of transcription factor binding sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Tompa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bailey</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>De Moor</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Eskin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Favorov</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Frith</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Fu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Makeev</snm>
                  <fnm>VJ</fnm>
               </au>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Noble</snm>
                  <fnm>WS</fnm>
               </au>
               <au>
                  <snm>Pavesi</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Pesole</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>R&#233;gnier</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Simonis</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sinha</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Thijs</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>van Helden</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Vandenbogaert</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Weng</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Workman</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ye</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2005</pubdate>
            <volume>23</volume>
            <fpage>137</fpage>
            <lpage>144</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt1053</pubid>
                  <pubid idtype="pmpid" link="fulltext">15637633</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Fast index based algorithms and software for matching position specific scoring matrices.</p>
            </title>
            <aug>
               <au>
                  <snm>Beckstette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Homann</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Giegerich</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kurtz</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>389</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1635428</pubid>
                  <pubid idtype="pmpid" link="fulltext">16930469</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-7-389</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Virtual Footprint and PRODORIC: an integrative framework for regulon prediction in prokaryotes.</p>
            </title>
            <aug>
               <au>
                  <snm>M&#252;nch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hiller</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Grote</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Scheer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Klein</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schobert</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jahn</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>4187</fpage>
            <lpage>4189</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti635</pubid>
                  <pubid idtype="pmpid" link="fulltext">16109747</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Use of the "Perceptron" algorithm to distinguish translational initiation sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Stormo</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Schneider</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Gold</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Ehrenfeucht</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1982</pubdate>
            <volume>10</volume>
            <fpage>2997</fpage>
            <lpage>3010</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">320670</pubid>
                  <pubid idtype="pmpid" link="fulltext">7048259</pubid>
                  <pubid idtype="doi">10.1093/nar/10.9.2997</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Computer methods to locate signals in nucleic acid sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Staden</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1984</pubdate>
            <volume>12</volume>
            <fpage>505</fpage>
            <lpage>519</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">321067</pubid>
                  <pubid idtype="pmpid" link="fulltext">6364039</pubid>
                  <pubid idtype="doi">10.1093/nar/12.1Part2.505</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <aug>
               <au>
                  <snm>Bernardo</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>AFM</fnm>
               </au>
            </aug>
            <source>Bayesian Theory</source>
            <publisher>New York: John Wiley &amp; Sons</publisher>
            <pubdate>1994</pubdate>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Accelerated quantification of Bayesian networks with incomplete data.</p>
            </title>
            <aug>
               <au>
                  <snm>Thiesson</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Proceedings of First International Conference on Knowledge Discovery and Data Mining (KDD-95): August 20-21 1995</source>
            <publisher>Montreal: AAAI Press</publisher>
            <editor>Fayyad U, Uthurusamy R</editor>
            <pubdate>1995</pubdate>
            <fpage>306</fpage>
            <lpage>311</lpage>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Choice of basis for Laplace approximation.</p>
            </title>
            <aug>
               <au>
                  <snm>MacKay</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Machine Learning</source>
            <pubdate>1998</pubdate>
            <volume>33</volume>
            <fpage>77</fpage>
            <lpage>86</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1023/A:1007558615313</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <aug>
               <au>
                  <snm>Heckerman</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>A Tutorial on Learning with Bayesian Networks</source>
            <publisher>Tech. Rep. MSR-TR-95-06, Microsoft Research</publisher>
            <pubdate>1995</pubdate>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Learning with mixtures of trees.</p>
            </title>
            <aug>
               <au>
                  <snm>Meila</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jordan</snm>
                  <fnm>MI</fnm>
               </au>
            </aug>
            <source>J Machine Learning Res</source>
            <pubdate>2000</pubdate>
            <volume>1</volume>
            <fpage>1</fpage>
            <lpage>48</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1162/153244301753344605</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Reilly</snm>
                  <fnm>AA</fnm>
               </au>
            </aug>
            <source>Proteins Struct Funct Genet</source>
            <pubdate>1990</pubdate>
            <volume>7</volume>
            <fpage>41</fpage>
            <lpage>51</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.340070105</pubid>
                  <pubid idtype="pmpid">2184437</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>A Bayesian method for the induction of probabilistic networks from data.</p>
            </title>
            <aug>
               <au>
                  <snm>Cooper</snm>
                  <fnm>GF</fnm>
               </au>
               <au>
                  <snm>Herskovits</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Machine Learning</source>
            <pubdate>1992</pubdate>
            <volume>9</volume>
            <fpage>309</fpage>
            <lpage>347</lpage>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Operations for learning with graphical models.</p>
            </title>
            <aug>
               <au>
                  <snm>Buntine</snm>
                  <fnm>WL</fnm>
               </au>
            </aug>
            <source>J Artific Intelligence Res</source>
            <pubdate>1994</pubdate>
            <volume>2</volume>
            <fpage>159</fpage>
            <lpage>225</lpage>
         </bibl>
         <bibl id="B40">
            <aug>
               <au>
                  <snm>Heckerman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Geiger</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chickering</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Learning Bayesian Networks: The Combination of Knowledge and Statistical Data</source>
            <publisher>Tech. rep., Microsoft Research, Redmond, WA: Advanced Technology Division</publisher>
            <pubdate>1995</pubdate>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Maximum likelihood from incomplete data via the EM algorithm.</p>
            </title>
            <aug>
               <au>
                  <snm>Dempster</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Laird</snm>
                  <fnm>NM</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>DB</fnm>
               </au>
            </aug>
            <source>J R Stat Soc Series B</source>
            <pubdate>1977</pubdate>
            <volume>39</volume>
            <fpage>1</fpage>
            <lpage>22</lpage>
         </bibl>
         <bibl id="B42">
            <title>
               <p>A weight array method for splicing signals analysis.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Marr</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1993</pubdate>
            <volume>9</volume>
            <fpage>499</fpage>
            <lpage>509</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8293321</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>On comparing classifiers: pitfalls to avoid and a recommended approach.</p>
            </title>
            <aug>
               <au>
                  <snm>Salzberg</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Data Mining Knowledge Discov</source>
            <pubdate>1997</pubdate>
            <volume>1</volume>
            <fpage>317</fpage>
            <lpage>328</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1023/A:1009752403260</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>A higher-order background model improves the detection of promoter regulatory elements by Gibbs sampling.</p>
            </title>
            <aug>
               <au>
                  <snm>Thijs</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Lescot</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Marchal</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Rombauts</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>De Moor</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rouz&#233;</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Moreau</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>1113</fpage>
            <lpage>1122</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/17.12.1113</pubid>
                  <pubid idtype="pmpid" link="fulltext">11751219</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Identifying transcription factor binding sites through Markov chain optimization.</p>
            </title>
            <aug>
               <au>
                  <snm>Ellrott</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Sladek</snm>
                  <fnm>FM</fnm>
               </au>
               <au>
                  <snm>Jiang</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>S100</fpage>
            <lpage>S109</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.1.100</pubid>
                  <pubid idtype="pmpid" link="fulltext">12385991</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Modeling dependencies in protein-DNA binding sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Barash</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Elidan</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Kaplan</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Proceedings of the Seventh Annual International Conference on Computational Molecular Biology</source>
            <pubdate>2003</pubdate>
            <fpage>28</fpage>
            <lpage>37</lpage>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Splice site identification by idlBNs.</p>
            </title>
            <aug>
               <au>
                  <snm>Castelo</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Guig&#243;</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <fpage>i69</fpage>
            <lpage>i76</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bth932</pubid>
                  <pubid idtype="pmpid" link="fulltext">15262783</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>A universal data compression system.</p>
            </title>
            <aug>
               <au>
                  <snm>Rissanen</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>IEEE Trans Inform Theory</source>
            <pubdate>1983</pubdate>
            <volume>29</volume>
            <fpage>656</fpage>
            <lpage>664</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1109/TIT.1983.1056741</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>The power of amnesia: learning probabilistic automata with variable memory length.</p>
            </title>
            <aug>
               <au>
                  <snm>Ron</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Singer</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Tishby</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Machine Learning</source>
            <pubdate>1996</pubdate>
            <volume>25</volume>
            <fpage>117</fpage>
            <lpage>149</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1023/A:1026490906255</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Context-specific independence in Bayesian networks.</p>
            </title>
            <aug>
               <au>
                  <snm>Boutilier</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Goldszmidt</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Koller</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proceedings of the Twelfth Annual Conference on Uncertainty in Artificial Intelligence</source>
            <pubdate>1996</pubdate>
            <fpage>115</fpage>
            <lpage>123</lpage>
         </bibl>
         <bibl id="B51">
            <aug>
               <au>
                  <snm>B&#252;hlmann</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Model Selection for Variable Length Markov Chains and Tuning the Context Algorithm</source>
            <publisher>Tech. Rep. 82, Statistics, Zurich: ETH Zentrum</publisher>
            <pubdate>1997</pubdate>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Finding short DNA motifs using permuted Markov models.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhao</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Speed</snm>
                  <fnm>TP</fnm>
               </au>
            </aug>
            <source>J Comput Biol</source>
            <pubdate>2005</pubdate>
            <volume>12</volume>
            <fpage>894</fpage>
            <lpage>906</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1089/cmb.2005.12.894</pubid>
                  <pubid idtype="pmpid" link="fulltext">16108724</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Identification of transcription factor binding sites with variable-order Bayesian networks.</p>
            </title>
            <aug>
               <au>
                  <snm>Ben-Gal</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Shani</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gohr</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Grau</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Arviv</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shmilovici</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Posch</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Grosse</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>2657</fpage>
            <lpage>2666</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti410</pubid>
                  <pubid idtype="pmpid" link="fulltext">15797905</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>Sequence logos: a new way to display consensus sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Schneider</snm>
                  <fnm>TD</fnm>
               </au>
               <au>
                  <snm>Stephens</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1990</pubdate>
            <volume>18</volume>
            <fpage>6097</fpage>
            <lpage>6100</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">332411</pubid>
                  <pubid idtype="pmpid" link="fulltext">2172928</pubid>
                  <pubid idtype="doi">10.1093/nar/18.20.6097</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>WebLogo: a sequence logo generator.</p>
            </title>
            <aug>
               <au>
                  <snm>Crooks</snm>
                  <fnm>GE</fnm>
               </au>
               <au>
                  <snm>Hon</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Chandonia</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Brenner</snm>
                  <fnm>SE</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <fpage>1188</fpage>
            <lpage>1190</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">419797</pubid>
                  <pubid idtype="pmpid" link="fulltext">15173120</pubid>
                  <pubid idtype="doi">10.1101/gr.849004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Genome-wide profiling of promoter recognition by the two-component response regulator CpxR-P in <it>Escherichia coli</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>De Wulf</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>McGuire</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>ECC</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2002</pubdate>
            <volume>277</volume>
            <fpage>26652</fpage>
            <lpage>26661</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M203487200</pubid>
                  <pubid idtype="pmpid" link="fulltext">11953442</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Phylogeny of the bacterial superfamily of Crp-Fnr transcription regulators: exploiting the metabolic spectrum by controlling alternative gene programs.</p>
            </title>
            <aug>
               <au>
                  <snm>K&#246;rner</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Sofia</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Zumft</snm>
                  <fnm>WG</fnm>
               </au>
            </aug>
            <source>FEMS Microbiol Rev</source>
            <pubdate>2003</pubdate>
            <volume>27</volume>
            <fpage>559</fpage>
            <lpage>592</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-6445(03)00066-4</pubid>
                  <pubid idtype="pmpid">14638413</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Identification of new Fis binding sites by DNA scission with Fis-1,10-phenanthroline-copper(I) chimeras.</p>
            </title>
            <aug>
               <au>
                  <snm>Pan</snm>
                  <fnm>CQ</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Sigman</snm>
                  <fnm>DS</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1996</pubdate>
            <volume>35</volume>
            <fpage>4326</fpage>
            <lpage>4333</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bi952040z</pubid>
                  <pubid idtype="pmpid" link="fulltext">8605181</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>Recognition of DNA by Fur: a reinterpretation of the Fur box consensus sequence.</p>
            </title>
            <aug>
               <au>
                  <snm>Baichoo</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Helmann</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2002</pubdate>
            <volume>184</volume>
            <fpage>5826</fpage>
            <lpage>5832</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">135393</pubid>
                  <pubid idtype="pmpid" link="fulltext">12374814</pubid>
                  <pubid idtype="doi">10.1128/JB.184.21.5826-5832.2002</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>A consensus sequence for binding of Lrp to DNA.</p>
            </title>
            <aug>
               <au>
                  <snm>Cui</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Stormo</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Calvo</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1995</pubdate>
            <volume>177</volume>
            <fpage>4872</fpage>
            <lpage>4880</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">177260</pubid>
                  <pubid idtype="pmpid" link="fulltext">7665463</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>Primary and secondary modes of DNA recognition by the NarL two-component response regulator.</p>
            </title>
            <aug>
               <au>
                  <snm>Maris</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Kaczor-Grzeskowiak</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ma</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Kopka</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Gunsalus</snm>
                  <fnm>RP</fnm>
               </au>
               <au>
                  <snm>Dickerson</snm>
                  <fnm>RE</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>2005</pubdate>
            <volume>44</volume>
            <fpage>14538</fpage>
            <lpage>14552</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bi050734u</pubid>
                  <pubid idtype="pmpid" link="fulltext">16262254</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Alternative respiratory pathways of <it>Escherichia coli</it>: energetics and transcriptional regulation in response to electron acceptors.</p>
            </title>
            <aug>
               <au>
                  <snm>Unden</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Bongaerts</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Biochim Biophys Acta</source>
            <pubdate>1997</pubdate>
            <volume>1320</volume>
            <fpage>217</fpage>
            <lpage>234</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0005-2728(97)00034-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">9230919</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>Nitrate repression of the <it>Escherichia coli </it>pfl operon is mediated by the dual sensors NarQ and NarX and the dual regulators NarL and NarP.</p>
            </title>
            <aug>
               <au>
                  <snm>Kaiser</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sawers</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1995</pubdate>
            <volume>177</volume>
            <fpage>3647</fpage>
            <lpage>3655</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">177079</pubid>
                  <pubid idtype="pmpid" link="fulltext">7601827</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B64">
            <title>
               <p>In vitro interaction of nitrate-responsive regulatory protein NarL with DNA target sequences in the fdnG, narG, narK and frdA operon control regions of <it>Escherichia coli </it>K-12.</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kustu</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Stewart</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1994</pubdate>
            <volume>241</volume>
            <fpage>150</fpage>
            <lpage>165</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1994.1485</pubid>
                  <pubid idtype="pmpid" link="fulltext">8057356</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>Differential regulation by the homologous response regulators NarL and NarP of <it>Escherichia coli </it>K-12 depends on DNA binding site arrangement.</p>
            </title>
            <aug>
               <au>
                  <snm>Darwin</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Tyson</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Busby</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Stewart</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <fpage>583</fpage>
            <lpage>595</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.1997.4971855.x</pubid>
                  <pubid idtype="pmpid">9302020</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>Transcriptional regulation and organization of the dcuA and dcuB genes, encoding homologous anaerobic C4-dicarboxylate transporters in <it>Escherichia coli</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Golby</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Kelly</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Guest</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Andrews</snm>
                  <fnm>SC</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1998</pubdate>
            <volume>180</volume>
            <fpage>6586</fpage>
            <lpage>6596</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">107762</pubid>
                  <pubid idtype="pmpid" link="fulltext">9852003</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>Analysis of nitrate regulatory protein NarL-binding sites in the fdnG and narG operon control regions of <it>Escherichia coli </it>K-12.</p>
            </title>
            <aug>
               <au>
                  <snm>Darwin</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Stewart</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>1996</pubdate>
            <volume>20</volume>
            <fpage>621</fpage>
            <lpage>632</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.1996.5491074.x</pubid>
                  <pubid idtype="pmpid">8736541</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B68">
            <title>
               <p>PRODORIC URL of the Matrix of NarL</p>
            </title>
            <url>http://www.prodoric.de/matrix.php?matrix_acc=MX000003</url>
         </bibl>
         <bibl id="B69">
            <title>
               <p>Microarray analysis of gene regulation by oxygen, nitrate, nitrite, FNR, NarL and NarP during anaerobic growth of <it>Escherichia coli</it>: new insights into microbial physiology.</p>
            </title>
            <aug>
               <au>
                  <snm>Overton</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Griffiths</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Patel</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Hobman</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Penn</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Cole</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Constantinidou</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Biochem Soc Trans</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>104</fpage>
            <lpage>107</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1042/BST0340104</pubid>
                  <pubid idtype="pmpid" link="fulltext">16417494</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B70">
            <title>
               <p>A reassessment of the FNR regulon and transcriptomic analysis of the effects of nitrate, nitrite, NarXL, and NarQP as <it>Escherichia coli</it> K12 adapts from aerobic to anaerobic growth.</p>
            </title>
            <aug>
               <au>
                  <snm>Constantinidou</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hobman</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Griffiths</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Patel</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Penn</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Cole</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Overton</snm>
                  <fnm>TW</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2006</pubdate>
            <volume>281</volume>
            <fpage>4802</fpage>
            <lpage>4815</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M512312200</pubid>
                  <pubid idtype="pmpid" link="fulltext">16377617</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B71">
            <title>
               <p>Identification and characterization of the caiF gene encoding a potential transcriptional activator of carnitine metabolism in <it>Escherichia coli</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Eichler</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Buchet</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lemke</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kleber</snm>
                  <fnm>HP</fnm>
               </au>
               <au>
                  <snm>Mandrand-Berthelot</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1996</pubdate>
            <volume>178</volume>
            <fpage>1248</fpage>
            <lpage>1257</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">177796</pubid>
                  <pubid idtype="pmpid" link="fulltext">8631699</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B72">
            <title>
               <p>The narL gene product activates the nitrate reductase operon and represses the fumarate reductase and trimethylamine N-oxide reductase operons in <it>Escherichia coli</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Iuchi</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>EC</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1987</pubdate>
            <volume>84</volume>
            <fpage>3901</fpage>
            <lpage>3905</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">304984</pubid>
                  <pubid idtype="pmpid">3035558</pubid>
                  <pubid idtype="doi">10.1073/pnas.84.11.3901</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B73">
            <title>
               <p>Complex transcriptional control links NikABCDE-dependent nickel transport with hydrogenase expression in <it>Escherichia coli</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Rowe</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Starnes</snm>
                  <fnm>GL</fnm>
               </au>
               <au>
                  <snm>Chivers</snm>
                  <fnm>PT</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2005</pubdate>
            <volume>187</volume>
            <fpage>6317</fpage>
            <lpage>6323</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1236639</pubid>
                  <pubid idtype="pmpid" link="fulltext">16159764</pubid>
                  <pubid idtype="doi">10.1128/JB.187.18.6317-6323.2005</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B74">
            <title>
               <p>Regulation of the ubiquinone (coenzyme Q) biosynthetic genes ubiCA in <it>Escherichia coli</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Kwon</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Druce-Hoffman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Meganathan</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Curr Microbiol</source>
            <pubdate>2005</pubdate>
            <volume>50</volume>
            <fpage>180</fpage>
            <lpage>189</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00284-004-4417-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">15902464</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B75">
            <title>
               <p>Coordinate regulation of the <it>Escherichia coli</it> formate dehydrogenase fdnGHI and fdhF genes in response to nitrate, nitrite, and formate: roles for NarL and NarP.</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Gunsalus</snm>
                  <fnm>RP</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2003</pubdate>
            <volume>185</volume>
            <fpage>5076</fpage>
            <lpage>5085</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">180993</pubid>
                  <pubid idtype="pmpid" link="fulltext">12923080</pubid>
                  <pubid idtype="doi">10.1128/JB.185.17.5076-5085.2003</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B76">
            <title>
               <p>TMAO anaerobic respiration in <it>Escherichia coli</it>: involvement of the tor operon.</p>
            </title>
            <aug>
               <au>
                  <snm>M&#233;jean</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Iobbi-Nivol</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lepelletier</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Giordano</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Chippaux</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pascal</snm>
                  <fnm>MC</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>1994</pubdate>
            <volume>11</volume>
            <fpage>1169</fpage>
            <lpage>1179</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1365-2958.1994.tb00393.x</pubid>
                  <pubid idtype="pmpid">8022286</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B77">
            <title>
               <p>Jstacs: A Java Framework for Statistical Analysis and Classification of Biological Sequences</p>
            </title>
            <url>http://www.jstacs.de</url>
         </bibl>
      </refgrp>
   </bm>
</art>

