DDIG-in: discriminating between disease-associated and neutral non-frameshifting micro-indels
1 School of Informatics, Indiana University Purdue University Indianapolis, 719 Indiana Ave., WK Bldg Suite 319, Indianapolis, Indiana 46202, USA
2 Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 410 West 10th Street, HITS Bldg Suite 5000, Indianapolis, Indiana 46202, USA
3 Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
4 Department of Medical and Molecular Genetics, Indiana University School of Medicine, 975 West Walnut Street, MRL Bldg IB130, Indianapolis, Indiana 46202, USA
Genome Biology 2013, 14:R23 doi:10.1186/gb-2013-14-3-r23
Published: 13 March 2013
Micro-indels (insertions or deletions shorter than 21 bp) constitute the second most frequent class of human gene mutation after single nucleotide variants. Despite the relative abundance of non-frameshifting indels, their damaging effect on protein structure and function has gone largely unstudied. We have developed a support vector machine-based method named DDIG-in (Detecting disease-causing genetic variations due to indels) to prioritize non-frameshifting indels by comparing disease-associated mutations with putatively neutral mutations from the 1000 Genomes Project. The final model gives good discrimination for indels and is robust against annotation errors. A webserver implementing DDIG-in is available at http://sparks-lab.org/ddig.
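The two definitions used above (a micro-indel is shorter than 21 bp; a non-frameshifting indel preserves the reading frame) can be illustrated with a minimal Python sketch. It assumes the variant is given as VCF-style reference and alternate allele strings within coding sequence; the function names are hypothetical and are not part of DDIG-in, and the sketch says nothing about the features or the classifier the method actually uses.

```python
def indel_length(ref: str, alt: str) -> int:
    """Net length change in bases: positive for insertions, negative for deletions."""
    return len(alt) - len(ref)


def is_micro_indel(ref: str, alt: str, max_len: int = 20) -> bool:
    """True if the net change is between 1 and max_len bases (shorter than 21 bp)."""
    n = abs(indel_length(ref, alt))
    return 0 < n <= max_len


def is_non_frameshifting(ref: str, alt: str) -> bool:
    """True if the net change is a non-zero multiple of 3, so the reading frame is kept."""
    n = abs(indel_length(ref, alt))
    return n > 0 and n % 3 == 0


if __name__ == "__main__":
    # Hypothetical example alleles, chosen only to exercise the checks.
    for ref, alt in [("A", "ACTG"), ("ACT", "A"), ("ACTGA", "AA")]:
        print(ref, alt, is_micro_indel(ref, alt), is_non_frameshifting(ref, alt))
```

Only the net length change is considered here, so a complex substitution that replaces several bases with a different number of bases is treated the same as a simple insertion or deletion of that net size.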