Assembled proportions of bacterial genomes for a simulated metagenome with sequencing errors. A total of 3 × 10⁹ 100-nucleotide reads were simulated with sequencing errors (0.25%) from a simulated metagenome containing 1,000 bacterial genomes with proportions following a power law. Having 1,000 genomes with power law proportions makes it impossible to classify sequences by their coverage. This large metagenomic dataset was assembled using distributed de Bruijn graphs and profiled with colored de Bruijn graphs. Genomes that are highly similar but distinct are likely to be hard to assemble. This figure shows the proportion of each genome that was assembled de novo within the metagenome. Of the bacterial genomes, 88.2% were assembled with a breadth of coverage of at least 80.0%.
Boisvert et al. Genome Biology 2012 13:R122 doi:10.1186/gb-2012-13-12-r122
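
The summary statistic in the caption (the share of genomes assembled with a breadth of coverage of at least 80.0%) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the half-open interval representation, and the toy values are assumptions made for illustration, and breadth of coverage is taken here to mean the fraction of a reference genome's positions covered by at least one assembled segment.

```python
def breadth_of_coverage(genome_length, intervals):
    """Fraction of genome positions covered by at least one interval.

    intervals: iterable of (start, end) pairs, 0-based and half-open,
    giving where assembled sequence aligns on the reference genome
    (a hypothetical input format chosen for this sketch).
    """
    covered = 0
    current_end = 0  # rightmost position already counted
    for start, end in sorted(intervals):
        start = max(start, current_end)  # skip positions counted earlier
        end = min(end, genome_length)    # clip to the genome
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


def fraction_at_least(breadths, threshold=0.80):
    """Share of genomes whose breadth of coverage meets the threshold."""
    return sum(b >= threshold for b in breadths) / len(breadths)


# Toy values only, not data from the paper.
toy_breadths = [
    breadth_of_coverage(100, [(0, 50), (40, 90)]),  # overlapping segments
    breadth_of_coverage(200, [(0, 60)]),            # poorly assembled genome
]
toy_share = fraction_at_least(toy_breadths)
```

Sorting the intervals and tracking the rightmost counted position keeps overlapping segments from being counted twice, which matters when several contigs align to the same region of a genome.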