Bis-SNP workflow. (a) Bis-SNP accepts BAM files produced by a genome mapping tool (BSMAP, MAQ, Novoalign, Bismark, and so on). The local realignment and base quality recalibration steps produce a new BAM file with recalibrated base quality scores. Finally, Bis-SNP performs SNP calling and outputs both methylation levels and SNP calls. (b) The SNP calling step is performed on each genomic position independently. Differences between the reference genome and the sample genome can produce one of 10 possible allele pairs, or genotypes (G; only 4 are shown here). Frequencies of all possible substitutions in the population are taken from the dbSNP database and represented as the prior π(G). A probabilistic model that incorporates prior probabilities for methylation level and bisulfite conversion efficiency is used to calculate the likelihood of observing the actual bisulfite read data (D) under each of the 10 genotypes, Pr(D|G). Finally, Bayesian inference combines this likelihood with the population frequency of each SNP to calculate the posterior probability Pr(G|D).
Liu et al. Genome Biology 2012 13:R61 doi:10.1186/gb-2012-13-7-r61
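The Bayesian step in panel (b) can be illustrated with a minimal sketch, which is not the Bis-SNP implementation. It computes Pr(G|D) ∝ π(G) · Pr(D|G) over the 10 unordered genotypes at a single position. Everything specific here is an assumption made for illustration: the function names (`base_likelihood`, `genotype_posteriors`), fixed values for methylation level (`meth`) and conversion efficiency (`conv`) rather than prior distributions over them, reads reduced to (base, error probability) pairs, and a model that only considers C-to-T conversion on one strand and ignores the strand information a real caller would use.

```python
from itertools import combinations_with_replacement

BASES = "ACGT"
# The 10 unordered allele pairs: AA, AC, AG, AT, CC, CG, CT, GG, GT, TT
GENOTYPES = ["".join(p) for p in combinations_with_replacement(BASES, 2)]


def base_likelihood(obs, allele, err, meth, conv):
    """Pr(observed base | true allele) for one bisulfite read base.

    Simplifying assumption: only an unmethylated C can be converted
    (read as T), with probability (1 - meth) * conv.
    """
    p_c_to_t = (1.0 - meth) * conv if allele == "C" else 0.0
    total = 0.0
    for emitted in BASES:
        # Probability the allele is present as `emitted` after bisulfite treatment
        if allele == "C":
            if emitted == "T":
                p_emit = p_c_to_t
            elif emitted == "C":
                p_emit = 1.0 - p_c_to_t
            else:
                p_emit = 0.0
        else:
            p_emit = 1.0 if emitted == allele else 0.0
        # Sequencing error spreads probability evenly over the 3 other bases
        p_obs = (1.0 - err) if obs == emitted else err / 3.0
        total += p_emit * p_obs
    return total


def genotype_posteriors(reads, prior, meth=0.5, conv=0.99):
    """Return Pr(G|D) for each genotype.

    reads: list of (base, error_probability) tuples at one position (D)
    prior: dict mapping genotype string to pi(G)
    """
    unnormalized = {}
    for g in GENOTYPES:
        likelihood = 1.0  # Pr(D|G), reads assumed independent
        for obs, err in reads:
            # Each read is drawn from either allele with equal probability
            likelihood *= 0.5 * base_likelihood(obs, g[0], err, meth, conv) \
                        + 0.5 * base_likelihood(obs, g[1], err, meth, conv)
        unnormalized[g] = prior[g] * likelihood
    z = sum(unnormalized.values())
    return {g: v / z for g, v in unnormalized.items()}


if __name__ == "__main__":
    uniform_prior = {g: 1.0 / len(GENOTYPES) for g in GENOTYPES}
    reads = [("A", 0.01)] * 8
    post = genotype_posteriors(reads, uniform_prior)
    print(max(post, key=post.get))
```

In this sketch the uniform prior stands in for the dbSNP-derived π(G); replacing it with population frequencies shifts the posterior toward known variants without changing the likelihood term.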