Overview of our automated procedure. InSite, which has two main phases, takes as input protein sequences and multiple pieces of evidence on protein-protein interactions and motif-motif interactions. (a) Motifs, downloaded from the Prosite or Pfam databases, were generated based on conservation in protein sequences. Protein-protein interactions are obtained from a variety of assays, including: a small set of 'reliable' interactions, which recurred in multiple experiments or were verified in low-throughput experiments; a set of interactions from yeast two-hybrid (Y2H) assays; and a set of interactions from the co-affinity precipitation assays of Krogan et al. and Gavin et al. (b) The first phase (Figures S2 and S3 in Additional data file 2) uses a Bayesian network to estimate both the motif pair binding affinities and the parameters governing the evidence models of protein-protein interactions (PPI) and motif-motif interactions (MMI); the model is trained to maximize the likelihood of the input data. Note that the affinity learned in this phase depends only on the types of the two motifs, regardless of which protein pair they occur in. (c) In the second phase (Figure S4 in Additional data file 2), we perform protein-specific binding site prediction based on the model learned in the first phase. For each protein pair, we compute a confidence score for each motif being the binding site between the two proteins. Note that these confidence scores are protein specific and can differ for the same motif depending on the context in which it appears.
Wang et al. Genome Biology 2007 8:R192 doi:10.1186/gb-2007-8-9-r192
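
The distinction drawn in (b) and (c), between a motif-pair affinity that is shared across all proteins and a confidence score that depends on the protein pair, can be illustrated with a small sketch. The caption does not specify the form of the model, so everything here is an assumption made for illustration: the noisy-OR combination of motif-pair affinities, the `leak` term, the log-ratio score, and the function names `interaction_probability` and `motif_confidence` are all hypothetical and are not taken from the InSite implementation.

```python
import math
from itertools import product


def interaction_probability(motifs_a, motifs_b, affinity, leak=0.01):
    """Assumed noisy-OR model: two proteins interact if at least one
    motif pair binds, or through a small 'leak' probability.

    `affinity` maps frozenset({motif_x, motif_y}) to a binding probability
    that depends only on the motif types (phase 1 output in this sketch).
    """
    p_no_binding = 1.0 - leak
    for ma, mb in product(motifs_a, motifs_b):
        p_no_binding *= 1.0 - affinity.get(frozenset((ma, mb)), 0.0)
    return 1.0 - p_no_binding


def motif_confidence(motif, motifs_a, motifs_b, affinity, leak=0.01):
    """Assumed protein-specific score: log-ratio of the interaction
    probability with and without `motif` present on protein A."""
    with_motif = interaction_probability(motifs_a, motifs_b, affinity, leak)
    remaining = [m for m in motifs_a if m != motif]
    without_motif = interaction_probability(remaining, motifs_b, affinity, leak)
    return math.log(with_motif / without_motif)


# Hypothetical affinities: M1 binds M2 strongly and M3 weakly.
affinity = {
    frozenset(("M1", "M2")): 0.8,
    frozenset(("M1", "M3")): 0.1,
}

protein_a = ["M1", "M4"]   # protein carrying the motif of interest
partner_b = ["M2"]         # partner with a high-affinity motif
partner_c = ["M3"]         # partner with a low-affinity motif

score_with_b = motif_confidence("M1", protein_a, partner_b, affinity)
score_with_c = motif_confidence("M1", protein_a, partner_c, affinity)
print(score_with_b, score_with_c)
```

In this toy setting the affinity table is fixed, yet the score for the same motif `M1` on the same protein changes with the interaction partner, which mirrors the context dependence described in (c).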