Multi-locus model for SNP discovery using multiply mapped reads. (a) Distinct SNPs occurring in paralogous genomic regions are largely undetectable using either unique or best-guess read mapping, because a sequence read containing a SNP can map to multiple paralogous loci with equal confidence. The figure shows an example with three paralogous loci and the possible configurations of the multi-locus genome with at most one SNP in the exact paralogous loci (cases labeled 'No SNP' and '1 SNP'). Sequences in positions other than the exact paralogous loci may carry other variations (shown as gray bars). Higher-order configurations are possible, as shown in the bottom '2 SNP' case. However, such configurations can only arise through exact parallel mutation or through SNPs that were segregating prior to genomic duplication. (b) Overview of the read mapping strategies employed in this study. Quality-blind means that per-base quality estimates (the probability that a base call is correct) are not considered during the mapping process.
Simola and Kim Genome Biology 2011 12:R55 doi:10.1186/gb-2011-12-6-r55
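The ambiguity described in panel (a) can be illustrated with a minimal Python sketch. It is not the authors' method: the three locus sequences, the read, and the helper names (`mismatches`, `best_hits`, `snp_configurations`) are hypothetical, and alignment is reduced to ungapped, quality-blind mismatch counting. The sketch shows a SNP-bearing read scoring equally well at all three paralogous loci, and enumerates which loci could carry the SNP when at most one SNP is allowed.

```python
from itertools import combinations

# Hypothetical toy data: three loci sharing an identical paralogous core,
# each with different flanking bases (the "gray bar" variation).
CORE = "ACGTACGTAC"
LOCI = {
    "A": "GG" + CORE + "TT",
    "B": "CA" + CORE + "AG",
    "C": "TC" + CORE + "GA",
}
# A read covering the core with one substituted base (A -> T at index 4).
READ = "ACGTTCGTAC"


def mismatches(read, ref, start):
    """Count mismatched bases between read and ref at a given offset."""
    window = ref[start:start + len(read)]
    return sum(a != b for a, b in zip(read, window))


def best_hits(read, loci):
    """Return (names of loci tied for fewest mismatches, that mismatch count).

    Ungapped and quality-blind: every base counts equally.
    """
    scores = {}
    for name, seq in loci.items():
        offsets = range(len(seq) - len(read) + 1)
        scores[name] = min(mismatches(read, seq, i) for i in offsets)
    best = min(scores.values())
    return sorted(n for n, s in scores.items() if s == best), best


def snp_configurations(locus_names, max_snps=1):
    """Yield every subset of loci (as a tuple) that could carry the SNP."""
    for k in range(max_snps + 1):
        yield from combinations(locus_names, k)


if __name__ == "__main__":
    hits, score = best_hits(READ, LOCI)
    print("equally good loci:", hits, "mismatches:", score)
    for config in snp_configurations(sorted(LOCI), max_snps=1):
        print("SNP at:", config if config else "no locus")
```

Because the read ties across all three loci, a unique-mapping strategy would discard it and a best-guess strategy would assign it arbitrarily, which is the situation the multi-locus model is meant to address. Raising `max_snps` to 2 also enumerates the higher-order case.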