Global verification of binding site predictions. Verification of motif-protein binding site predictions against solved PDB structures. Candidate binding sites are ranked by our predicted binding confidence. The X-axis is the number of sites predicted to be binding that are non-binding in PDB (false positives). The Y-axis is the number of PDB-verified binding sites that are also predicted to be binding (true positives). The green and red curves are for InSite with Prosite and Pfam, respectively; InSite is tailored to binding site prediction and explicitly models the noise in the different experimental assays. The brown curve is for the DPEA score of Riley et al. The gray curve is for the score derived from the parsimony approach of Guimaraes et al. The black curve is for the integrative approach of Lee et al. The purple curve is what we expect from random predictions. (a) Results using Prosite motifs. With both axes normalized to the interval [0,1], the areas under the curve are 0.680, 0.601, and 0.5 for InSite, DPEA by Riley et al., and random prediction, respectively. (b) Results when we train on Pfam domains and evaluate the PDB binding sites only on Pfam-A domains, following the protocol of Riley et al. With both axes normalized to the interval [0,1], the areas under the curve are 0.786, 0.745, 0.619, and 0.620 for InSite, the integrative approach by Lee et al., DPEA by Riley et al., and the parsimony approach by Guimaraes et al., respectively.
Wang et al. Genome Biology 2007 8:R192 doi:10.1186/gb-2007-8-9-r192
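The curves and areas described in the caption can be illustrated with a minimal sketch. It assumes each candidate site has a confidence score and a boolean PDB label, that scores are distinct (ties are not treated specially), and that the area is computed with the trapezoid rule. The function names `ranked_curve` and `normalized_auc` are hypothetical and are not taken from the InSite software.

```python
from typing import List, Sequence, Tuple


def ranked_curve(scores: Sequence[float],
                 labels: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return (false positives, true positives) after each ranked prediction.

    Sites are visited from highest to lowest confidence. A site that is
    binding in PDB moves the curve up; a non-binding site moves it right.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fp, tp = 0, 0
    points = [(0, 0)]
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp, tp))
    return points


def normalized_auc(points: Sequence[Tuple[int, int]]) -> float:
    """Area under the curve after scaling both axes to [0, 1]."""
    total_fp, total_tp = points[-1]
    if total_fp == 0 or total_tp == 0:
        raise ValueError("need at least one binding and one non-binding site")
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Trapezoid rule; vertical steps (x1 == x0) contribute no area.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area / (total_fp * total_tp)


# Toy data (made up for illustration, not from the paper).
toy_scores = [0.9, 0.8, 0.7, 0.6]
toy_labels = [True, False, True, False]
toy_auc = normalized_auc(ranked_curve(toy_scores, toy_labels))
```

Under these assumptions a perfect ranking gives an area of 1.0, and a random ranking gives about 0.5 on average, which matches the value the caption reports for random prediction.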