Redundancy structure analysis of the human genome. (a) We generated 2 × 10⁶ PE reads randomly and with perfect fidelity (error rate e = 0; divergence p = 0) from the NCBI Homo sapiens genome. Read sets were generated for four different fragment lengths (250, 500, 750, and 1,000 nucleotides) and four different sequencing read lengths (30, 60, 90, and 120 nucleotides) and mapped to that genome using Bowtie, allowing 1, 2, 3, or 3 alignment mismatches depending on read length. Heat map cells represent the percentage of sampled reads having more than one valid alignment, computed over the total number of reads for that data set. The distributions of the number of alignments per read are shown in Additional file 2. (b) Six experimental and four simulated read sets containing both base-call sequencing errors (e ≈ 0.001) and population sequence divergence (p ≈ 0.001) were mapped to the reference human genome. The percentage of non-unique alignments is shown for each read set. The percentage of loci covered by at least one unique alignment was computed after pooling all experimental or simulated read maps. Nt, nucleotide.
Simola and Kim Genome Biology 2011 12:R55 doi:10.1186/gb-2011-12-6-r55
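
The heat map statistic in panel (a) can be illustrated with a minimal sketch. It assumes that the number of valid alignments for each read has already been extracted from the aligner output. The function name `percent_non_unique` and the input format (a plain list of per-read counts) are hypothetical, chosen for illustration, and are not taken from the paper or from Bowtie.

```python
def percent_non_unique(alignment_counts):
    """Return the percentage of reads with more than one valid alignment.

    alignment_counts: per-read number of valid alignments (assumed input
    format). Reads with zero alignments stay in the denominator, since the
    caption normalizes by the total number of reads in the data set.
    """
    counts = list(alignment_counts)
    if not counts:
        raise ValueError("no reads supplied")
    # A read is non-unique when it has two or more valid alignments.
    non_unique = sum(1 for c in counts if c > 1)
    return 100.0 * non_unique / len(counts)


# Toy example: 8 reads, of which 3 align to more than one location.
example_counts = [1, 1, 2, 5, 0, 1, 3, 1]
print(percent_non_unique(example_counts))  # 37.5
```

Each heat map cell would correspond to one such value, computed for a single combination of fragment length and read length.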