LUMPY: a probabilistic framework for structural variant discovery
1 Department of Biochemistry and Molecular Genetics, Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
2 Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA
3 Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
Genome Biology 2014, 15:R84. doi:10.1186/gb-2014-15-6-r84. Published: 26 June 2014
Comprehensive discovery of structural variation (SV) from whole genome sequencing data requires multiple detection signals including read-pair, split-read, read-depth and prior knowledge. Owing to technical challenges, extant SV discovery algorithms either use one signal in isolation, or at best use two sequentially. We present LUMPY, a novel SV discovery framework that naturally integrates multiple SV signals jointly across multiple samples. We show that LUMPY yields improved sensitivity, especially when SV signal is reduced owing to either low coverage data or low intra-sample variant allele frequency. We also report a set of 4,564 validated breakpoints from the NA12878 human genome. https://github.com/arq5x/lumpy-sv