Binding site predictions within the Pol II complex. (a) A schematic illustration of the interactions within the Pol II complex, as revealed by its three-dimensional structure. Each circle numbered k represents the protein 'Rpbk' (for example, circle 1 is Rpb1). (b) One of our top predictions is 'Pfam-A domain PF01096 on Rpb9 binds to Rpb1'. Both Rpb9 and Rpb1 are part of the co-crystallized Pol II complex in the PDB (ID: 1I50). Rpb9 is shown as the light green chain, with the accessible surface of the domain rendered in white; Rpb1 is shown as the light orange chain, with its residues that are in contact with the domain shown in orange. These contacts verify our prediction. (c) Binding site predictions for interactions involving Rpb10. A red arrow connects a motif to a protein that it binds according to the three-dimensional structure of the complex. A dashed black arrow represents a non-binding site. The number on each arrow is the rank of that pair by our predicted binding confidence. We assigned confidence values to a total of 123 motif-protein pairs in this complex. In this case, all six PDB-verified binding sites (red arrows) rank in the top half, while all five non-binding sites have low confidence values with ranks below 100. (d) ROC curve for our motif-protein binding site predictions within the Pol II complex. There are 123 possible binding sites within the complex that involve the Pfam-A domains in our dataset, of which 68 (55.3%) are true binding sites according to the three-dimensional structure. The possible binding sites are ranked by our predicted binding confidences. The X-axis is the number of non-binding sites within the complex that are predicted to be binding. The Y-axis is the number of PDB-verified binding sites that are also predicted to be binding. The purple line shows the performance expected by chance.
Wang et al. Genome Biology 2007 8:R192 doi:10.1186/gb-2007-8-9-r192
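The curve in panel (d) plots counts rather than rates: as the confidence cutoff is lowered one ranked pair at a time, each PDB-verified binding site moves the curve up and each non-binding site moves it right. The following is a minimal sketch of that construction, not the authors' code. The function name `roc_counts`, the argument names, and the toy confidences and labels are all hypothetical and are not taken from the paper; tied confidences are simply kept in input order.

```python
from typing import List, Sequence, Tuple


def roc_counts(
    confidences: Sequence[float], is_binding: Sequence[bool]
) -> List[Tuple[int, int]]:
    """Return (false positives, true positives) after each rank cutoff.

    Pairs are ranked from highest to lowest confidence. The first point
    is (0, 0), meaning nothing has been predicted as binding yet.
    """
    if len(confidences) != len(is_binding):
        raise ValueError("confidences and is_binding must have equal length")

    # Indices sorted by descending confidence (stable, so ties keep input order).
    order = sorted(
        range(len(confidences)), key=lambda i: confidences[i], reverse=True
    )

    points = [(0, 0)]
    false_pos = true_pos = 0
    for i in order:
        if is_binding[i]:
            true_pos += 1   # verified binding site: step up
        else:
            false_pos += 1  # non-binding site: step right
        points.append((false_pos, true_pos))
    return points


# Toy example with invented values (not data from the paper).
toy_confidences = [0.9, 0.8, 0.7, 0.4, 0.2]
toy_is_binding = [True, True, False, True, False]
print(roc_counts(toy_confidences, toy_is_binding))
```

With the counts given in the caption (68 binding sites among 123 pairs, hence 55 non-binding sites), such a curve would end at the point (55, 68), and the chance line would be the straight segment from (0, 0) to that point.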