BoCaTFBS: a boosted cascade learner to refine the binding sites suggested by ChIP-chip experiments
1 Integrated Data Systems Department, Siemens Corporate Research, 755 College Road East, Princeton, New Jersey 08540, USA
2 Department of Molecular, Cellular, and Developmental Biology, KBT 926, 266 Whitney Ave, Yale University, New Haven, Connecticut 06520, USA
3 Department of Molecular Biophysics and Biochemistry, Bass 432A, 266 Whitney Ave, Yale University, New Haven, CT 06520, USA
4 Program in Computational Biology and Bioinformatics, Bass 432A, 266 Whitney Ave, Yale University, New Haven, CT 06520, USA
5 Department of Computer Science, 51 Prospect Street, Yale University, New Haven, Connecticut 06520, USA
Genome Biology 2006, 7:R102 doi:10.1186/gb-2006-7-11-r102
Published: 1 November 2006
Comprehensive mapping of transcription factor binding sites is essential in postgenomic biology. To this end, we propose a data mining approach that combines noisy data from ChIP (chromatin immunoprecipitation)-chip experiments with known binding site patterns. Our method, BoCaTFBS, uses boosted cascades of classifiers for efficiency, with alternating decision trees as the component classifiers; it exploits interpositional correlations; and it explicitly integrates the large amount of negative information available from ChIP-chip experiments. We applied BoCaTFBS within the ENCODE project and showed that it outperforms many traditional binding site identification methods (for instance, profiles).
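To make the idea of a boosted cascade concrete, the following is a minimal sketch and not the BoCaTFBS implementation. It makes several assumptions that are not in the text above: the component learners are single-position decision stumps rather than alternating decision trees (so, unlike the method described, they cannot capture interpositional correlations), each stage is plain AdaBoost, each stage threshold is set so that every training positive passes, and the sequences are invented toy data. All function names are hypothetical.

```python
import math

BASES = "ACGT"

def stump_predict(stump, seq):
    """Vote +polarity if seq has `base` at `pos`, otherwise -polarity."""
    pos, base, polarity = stump
    return polarity if seq[pos] == base else -polarity

def train_adaboost(seqs, labels, rounds):
    """Plain AdaBoost over (position, base, polarity) stumps; labels are +1/-1."""
    n = len(seqs)
    weights = [1.0 / n] * n
    stumps = [(p, b, s) for p in range(len(seqs[0])) for b in BASES for s in (1, -1)]
    model = []
    for _ in range(rounds):
        # Choose the stump with the lowest weighted training error.
        best, best_err = None, float("inf")
        for st in stumps:
            err = sum(w for w, x, y in zip(weights, seqs, labels)
                      if stump_predict(st, x) != y)
            if err < best_err:
                best, best_err = st, err
        best_err = min(max(best_err, 1e-9), 1 - 1e-9)  # avoid log(0)
        alpha = 0.5 * math.log((1 - best_err) / best_err)
        model.append((alpha, best))
        # Up-weight misclassified examples, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(best, x))
                   for w, x, y in zip(weights, seqs, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def score(model, seq):
    """Weighted vote of all stumps in one boosted stage."""
    return sum(alpha * stump_predict(st, seq) for alpha, st in model)

def train_cascade(positives, negatives, stages=3, rounds=3):
    """Each stage trains on all positives plus the negatives that survived
    the earlier stages; negatives rejected by a stage are dropped."""
    cascade = []
    remaining = list(negatives)
    for _ in range(stages):
        if not remaining:
            break
        seqs = positives + remaining
        labels = [1] * len(positives) + [-1] * len(remaining)
        model = train_adaboost(seqs, labels, rounds)
        # Assumed rule: threshold low enough that no training positive is lost.
        threshold = min(score(model, s) for s in positives)
        cascade.append((model, threshold))
        remaining = [s for s in remaining if score(model, s) >= threshold]
    return cascade

def cascade_predict(cascade, seq):
    """A sequence is called a binding site only if it passes every stage."""
    return all(score(model, seq) >= thr for model, thr in cascade)

# Invented toy data: three motif-like positives and four homopolymer negatives.
positives = ["TGACTCA", "TGAGTCA", "TGACTCT"]
negatives = ["AAAAAAA", "CCCCCCC", "GGGGGGG", "TTTTTTT"]
cascade = train_cascade(positives, negatives)
print(len(cascade), [cascade_predict(cascade, s) for s in positives + negatives])
```

The point of the cascade structure is that early stages discard most negatives cheaply, so later stages only have to separate the positives from the harder negatives that remain. This matters when negatives vastly outnumber positives, as with the negative regions from ChIP-chip experiments.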