Sniper: improved SNP discovery by multiply mapping deep sequenced reads
1 Department of Biology, University of Pennsylvania, 433 S. University Ave, Philadelphia, PA 19104, USA
2 Department of Cell and Developmental Biology, University of Pennsylvania, 421 Curie Blvd, Philadelphia, PA 19104, USA
3 Penn Genome Frontiers Institute, University of Pennsylvania, 433 S. University Ave, Philadelphia, PA 19104, USA
Genome Biology 2011, 12:R55 doi:10.1186/gb-2011-12-6-r55
Published: 20 June 2011
Single nucleotide polymorphism (SNP) discovery using next-generation sequencing data remains difficult, primarily because of redundant genomic regions, such as interspersed repetitive elements and paralogous genes, that are present in all eukaryotic genomes. To address this problem, we developed Sniper, a novel multi-locus Bayesian probabilistic model and a computationally efficient algorithm that explicitly incorporate sequence reads that map to multiple genomic loci. Our model fully accounts for sequencing error, template bias, and multi-locus SNP combinations, maintaining high sensitivity and specificity under a broad range of conditions. An implementation of Sniper is freely available at http://kim.bio.upenn.edu/software/sniper.shtml.
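To give a rough intuition for why multiply mapped reads matter in a Bayesian SNP call, the following is a minimal single-locus sketch. It is not the Sniper model: it ignores template bias and multi-locus SNP combinations, and it simply down-weights each read by an assumed mapping weight (for example, 1 divided by the number of loci the read aligns to). The function name `genotype_posterior`, the uniform error rate, and the prior values are all hypothetical choices made for illustration.

```python
import math


def genotype_posterior(bases, weights, ref, alt,
                       error=0.01, prior=(0.998, 0.001, 0.001)):
    """Toy posterior over (ref/ref, ref/alt, alt/alt) at one site.

    bases   : observed base calls covering the site
    weights : assumed per-read mapping weights in (0, 1]; a uniquely
              mapped read has weight 1, a read aligning to k loci
              might be given 1/k (an illustrative heuristic)
    error   : assumed uniform per-base sequencing error rate
    prior   : assumed genotype priors, in the order listed above
    """
    genotypes = [(ref, ref), (ref, alt), (alt, alt)]
    log_post = []
    for (a, b), p in zip(genotypes, prior):
        ll = math.log(p)
        for base, w in zip(bases, weights):
            # Probability of the observed base given each allele.
            pa = 1 - error if base == a else error / 3
            pb = 1 - error if base == b else error / 3
            # Each allele is sampled with probability 1/2; the read's
            # log-likelihood is scaled by its mapping weight.
            ll += w * math.log(0.5 * pa + 0.5 * pb)
        log_post.append(ll)
    # Normalize in log space for numerical stability.
    m = max(log_post)
    unnorm = [math.exp(x - m) for x in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Ten reference reads and ten alternate reads at the same site.
bases = ["A"] * 10 + ["G"] * 10

# All reads uniquely mapped: the alternate reads count fully.
unique = genotype_posterior(bases, [1.0] * 20, ref="A", alt="G")

# Alternate reads each align to ten loci: their evidence is diluted.
diluted = genotype_posterior(bases, [1.0] * 10 + [0.1] * 10, ref="A", alt="G")
```

In this toy setting the same pile of base calls supports a heterozygous call when every read is uniquely mapped, but not when the alternate-allele reads could equally have come from many other loci. A full multi-locus treatment, as described in the abstract, models the candidate loci jointly rather than applying a fixed per-read weight.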