Fast and efficient taxonomic profiling with distributed colored de Bruijn graphs. From a previous study, 124 metagenomic samples containing short paired reads were assembled de novo and profiled for taxons. The graph coloring occurred once the de Bruijn graph was assembled de novo. (A) The taxonomic profiles are shown for the phylum level. The two most abundant phyla were Firmicutes and Bacteroidetes. This is in agreement with the literature . The abundance of human sequences was also measured. The phylum Chordata had two outlier samples. This indicates that two of the samples had more human sequences than the average, which may bias results. (B) At the genus level, the most abundant taxon was Bacteroides. This taxon occurred more than once because it was present at different locations within the Greengenes taxonomic tree. Also abundant is the genus Prevotella. Furthermore, the later had numerous samples with higher counts, which may help in non-parametric clustering. Two samples had higher abundance of human sequences, as indicated by the abundance of the genus Homo.
Boisvert et al. Genome Biology 2012 13:R122 doi:10.1186/gb-2012-13-12-r122