Method

Coverage and error models of protein-protein interaction data by directed graph analysis

Tony Chiang1,2*, Denise Scholtens3, Deepayan Sarkar2, Robert Gentleman2 and Wolfgang Huber1

Author affiliations

1 EMBL, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK

2 Fred Hutchinson Cancer Research Center, Computational Biology Group, Fairview Avenue North, Seattle, WA 98109-1024, USA

3 Northwestern University, Department of Preventive Medicine, N Lake Shore Drive, Chicago, IL 60611-4402, USA


Citation and License

Genome Biology 2007, 8:R186  doi:10.1186/gb-2007-8-9-r186

Published: 10 September 2007


Using a directed graph model for bait-to-prey systems and a multinomial error model, we assessed the error statistics in all published large-scale datasets for Saccharomyces cerevisiae and characterized the datasets by three traits: the set of tested interactions, artifacts that lead to false-positive or false-negative observations, and estimates of the stochastic error rates that affect the data. Together, these traits are a prerequisite for estimating the protein interactome and its modules.
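
To make the directed graph view concrete, the following is a minimal sketch in Python, not the authors' implementation. Each detected interaction is stored as a directed edge from bait to prey, and every pair of proteins is then sorted into one of three categories: reciprocated, unreciprocated, or no edge. Counts of this kind are the sort of input a multinomial error model could take. The function name `classify_pairs`, the toy data, and the rule that a protein counts as tested in both roles once it appears at least once as a bait and at least once as a prey are all assumptions made for illustration.

```python
from itertools import combinations


def classify_pairs(observations):
    """Count reciprocated, unreciprocated, and absent edges.

    observations: iterable of (bait, prey) tuples, one per detected
    interaction. Self-interactions are ignored in this sketch.
    """
    # Directed edge set: bait -> prey.
    edges = {(bait, prey) for bait, prey in observations if bait != prey}

    baits = {bait for bait, _ in edges}
    preys = {prey for _, prey in edges}

    # Assumption: a protein seen at least once as bait AND at least once
    # as prey is treated as having been tested in both directions.
    tested_both_ways = sorted(baits & preys)

    counts = {"reciprocated": 0, "unreciprocated": 0, "none": 0}
    labels = ("none", "unreciprocated", "reciprocated")
    for a, b in combinations(tested_both_ways, 2):
        # Number of directions (0, 1, or 2) in which the pair was detected.
        n_directions = ((a, b) in edges) + ((b, a) in edges)
        counts[labels[n_directions]] += 1
    return counts


# Hypothetical toy data: (bait, prey) pairs.
example = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "D"), ("D", "A")]
counts = classify_pairs(example)
print(counts)
```

Pairs are restricted to proteins observed in both roles because an unreciprocated edge is only informative when the reverse direction was actually tested. How the tested set is determined in practice depends on the experiment, so the rule used here should be read as a placeholder.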