Rooting the eutherian tree: the power and pitfalls of phylogenomics
1 Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 4259-B-21 Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan
2 Department of Statistical Modeling, Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106-8569, Japan
3 School of Life Sciences, Fudan University, Handan Road 220#, Shanghai 200433, China
Genome Biology 2007, 8:R199 doi:10.1186/gb-2007-8-9-r199
Published: 21 September 2007
Ongoing genome sequencing projects have led to a phylogenetic approach based on genome-scale data (phylogenomics), which is beginning to shed light on longstanding unresolved phylogenetic issues. The use of large datasets in phylogenomic analysis results in a global increase in resolution due to a decrease in sampling error. However, a fully resolved tree can still be wrong if the phylogenetic inference is biased.
Here, in an attempt to root the eutherian tree using genome-scale data with the maximum likelihood method, we demonstrate a case in which a concatenated analysis strongly supports a putatively wrong tree, whereas the total evaluation of separate analyses of different genes greatly reduces the bias of the phylogenetic inference. A conventional concatenated analysis of nucleotide sequences from our dataset, which comprises an alignment of more than 1 megabase from 2,789 nuclear genes, misleadingly supports the monophyly of Afrotheria (for example, elephant) and Xenarthra (for example, armadillo) with 100% bootstrap probability. However, this tree is not supported by our 'separate method', which takes into account the different tempos and modes of evolution among genes; instead, the basal Afrotheria tree is favored.
Our analysis demonstrates that when evolutionary features vary greatly among genes, the separate model, rather than the concatenate model, should be used for phylogenetic inference, especially with genome-scale data.
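The 'total evaluation of separate analyses' can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: it assumes each gene has already been fitted separately, with its own parameters, under each candidate topology, and that the total support for a topology is the sum of its per-gene log-likelihoods. The function names, the tree labels and all numeric values are hypothetical and were invented for this example; they are not results from the study.

```python
from typing import Dict, List


def total_log_likelihoods(per_gene_lnl: Dict[str, List[float]]) -> Dict[str, float]:
    """Sum per-gene log-likelihoods for each candidate tree topology.

    Each list holds one log-likelihood per gene, obtained by fitting that
    gene on its own (separate model) under the given topology.
    """
    return {tree: sum(values) for tree, values in per_gene_lnl.items()}


def best_tree(per_gene_lnl: Dict[str, List[float]]) -> str:
    """Return the topology with the highest total log-likelihood."""
    totals = total_log_likelihoods(per_gene_lnl)
    return max(totals, key=totals.get)


# Toy values for three genes (made up for illustration only).
toy_lnl = {
    "basal_Afrotheria": [-1200.0, -980.5, -1500.25],
    "Afrotheria_plus_Xenarthra": [-1201.5, -981.0, -1499.75],
    "basal_Xenarthra": [-1202.0, -982.5, -1501.0],
}

if __name__ == "__main__":
    for tree, total in total_log_likelihoods(toy_lnl).items():
        print(f"{tree}: {total:.2f}")
    print("Preferred topology:", best_tree(toy_lnl))
```

In this toy input a single gene prefers the Afrotheria plus Xenarthra grouping, yet the summed evidence across genes favors a different topology. A concatenated analysis would instead fit one shared set of parameters to all genes at once, which is where the heterogeneity among genes can bias the inference.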