Workflow of the pipeline. For each position in the target region, samples having the same consensus nucleotide are used as reference samples. Reads from the reference samples are categorized into bins according to their position and orientation when mapped to the reference genome; reads from the sample in question are assigned to the same set of bins. By comparing the minor allele count with the expected error count (derived from the reference panel), bins are divided into two categories: bins with a minor allele count less than or equal to the expected error (denoted by a green check mark), and bins with a minor allele count greater than the expected error (red cross). Different methods (Poisson, Fisher's exact, empirical) are used to calculate the P-value, which represents the deviation of the observation from the expectation under the error model. The P-values are then used to calculate the bias statistic, which is further converted to a Phred-like quality score representing the uncertainty concerning the minor allele at this position.
Li and Stoneking Genome Biology 2012 13:R34 doi:10.1186/gb-2012-13-5-r34
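The per-bin comparison in the workflow can be illustrated with a minimal Python sketch. It covers only the Poisson variant and the generic Phred transform Q = -10 log10(p). The function names (`poisson_upper_tail`, `classify_bins`, `phred`), the input layout, and the example counts are hypothetical and not taken from the paper. The caption does not specify how the per-bin P-values are combined into the bias statistic, so that step is deliberately left out.

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), i.e. the chance of seeing at
    least k minor-allele reads when lam errors are expected."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):      # accumulate P(X = 1) ... P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def classify_bins(minor_counts, expected_errors):
    """Compare observed minor-allele counts with expected error counts, bin by bin.

    Assumption: both inputs are parallel sequences with one entry per bin.
    """
    results = []
    for observed, expected in zip(minor_counts, expected_errors):
        results.append({
            "observed": observed,
            "expected": expected,
            # True corresponds to the "red cross" category in the figure.
            "exceeds_expected": observed > expected,
            "p_value": poisson_upper_tail(observed, expected),
        })
    return results


def phred(p, floor=1e-300):
    """Generic Phred transform of a probability; the floor avoids log10(0)."""
    return -10.0 * math.log10(max(p, floor))


if __name__ == "__main__":
    # Made-up example: three bins with observed minor-allele counts and the
    # error counts expected from a hypothetical reference panel.
    for row in classify_bins([0, 2, 9], [1.0, 2.5, 1.5]):
        print(row)
```

A bin in which the observed count exceeds the expected error receives a small upper-tail P-value when the excess is large, which is the kind of deviation the error model is meant to flag.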