This article has not been peer reviewed. Deposited research article
The Adaptive Evolution Database (TAED)
1 Department of Chemistry, University of Florida, Gainesville, FL 32611, USA
2 Department of Anatomy and Cell Biology, University of Florida, Gainesville, FL 32611, USA
3 Bioinformatics Division, EraGen Biosciences, 12085 Research Drive, Alachua, FL 32615, USA
4 Current Address: Department of Biochemistry and Biophysics and Stockholm Bioinformatics Center, Stockholm University, 10691 Stockholm, Sweden
5 Current Address: Maxygen, 515 Galveston Drive, Redwood City, CA 94063, USA
Genome Biology 2001, 2:preprint0003-preprint0003.18 doi:10.1186/gb-2001-2-4-preprint0003
This was the first version of this article to be made available publicly. A peer-reviewed and modified version is now available in full at http://genomebiology.com/2001/2/8/research/0028
Published: 9 March 2001
Understanding the molecular basis for the divergence of species lies at the heart of biology. The Adaptive Evolution Database (TAED) serves as a starting point for linking events that occur at the same time in the evolutionary history (tree of life) of species, based upon coding sequence evolution analyzed with the Master Catalog. The Master Catalog is a collection of evolutionary models, including multiple sequence alignments, phylogenetic trees, and reconstructed ancestral sequences, for all independently evolving protein sequence modules encoded by genes in GenBank.
From these models, we have estimated the ratio of nonsynonymous to synonymous nucleotide substitution (Ka/Ks) for each branch of the evolutionary tree of every subtree containing only Chordata or only Embryophyta proteins. Branches with high Ka/Ks values represent candidate episodes in the history of a family where the protein may have undergone positive selection, a phenomenon in molecular evolution where the mutant form of a gene must have conferred greater fitness than the ancestral form. Such episodes are frequently associated with a change in function. We have found that an unexpectedly large number of families (between 10 and 20% of those examined) have at least one branch with a notably high Ka/Ks value (putative adaptive evolution). As a resource for biologists wishing to understand the interaction between protein sequences and the Darwinian processes that shape these sequences, we have collected these families into The Adaptive Evolution Database (TAED).
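The branch-level screen described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the record layout, the function names, the toy data, and the cutoff of 1.0 are all assumptions made for illustration, since the text here does not state what counts as a "notably high" Ka/Ks value.

```python
from typing import NamedTuple, Optional


class Branch(NamedTuple):
    family: str   # hypothetical family identifier
    branch: str   # hypothetical branch label within the family tree
    ka: float     # nonsynonymous substitutions per nonsynonymous site
    ks: float     # synonymous substitutions per synonymous site


def ka_ks(b: Branch) -> Optional[float]:
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    if b.ks <= 0:
        return None
    return b.ka / b.ks


def candidate_branches(branches, cutoff=1.0):
    """Keep branches whose Ka/Ks exceeds the cutoff (an assumed value)."""
    out = []
    for b in branches:
        ratio = ka_ks(b)
        if ratio is not None and ratio > cutoff:
            out.append(b)
    return out


def fraction_of_families_flagged(branches, cutoff=1.0):
    """Fraction of families with at least one branch above the cutoff."""
    families = {b.family for b in branches}
    flagged = {b.family for b in candidate_branches(branches, cutoff)}
    return len(flagged) / len(families) if families else 0.0


# Toy data, invented for illustration only.
EXAMPLE = [
    Branch("famA", "n1-n2", ka=0.30, ks=0.20),
    Branch("famA", "n2-n3", ka=0.02, ks=0.40),
    Branch("famB", "n1-n2", ka=0.01, ks=0.50),
    Branch("famC", "n1-n2", ka=0.05, ks=0.00),
]
```

With this toy input, only the first branch of famA exceeds the assumed cutoff, so one of the three families is flagged. Branches with Ks equal to zero are skipped rather than treated as infinitely high, which is one possible design choice among several.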
Placed in a phylogenetic perspective, candidate genes that are undergoing evolution at the same time in the same lineage can be viewed together. This framework, based upon coding sequence evolution, can be readily expanded to include other types of evolution. In its present form, TAED provides a resource for bioinformaticists interested in data mining and for experimental evolutionists seeking candidate examples of adaptive evolution for further experimental study.
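Viewing candidates from the same lineage together amounts to grouping flagged families by the branch of the species tree on which they fall. The sketch below shows one way this could look in Python; the lineage and family labels are hypothetical placeholders, not TAED identifiers, and the function is an illustrative assumption rather than part of the database.

```python
from collections import defaultdict


def group_by_lineage(candidates):
    """Map each lineage label to the sorted gene families flagged on it.

    `candidates` is an iterable of (lineage, family) pairs; both labels
    are hypothetical placeholders used only for this example.
    """
    grouped = defaultdict(set)
    for lineage, family in candidates:
        grouped[lineage].add(family)  # a set drops repeated pairs
    return {lineage: sorted(fams) for lineage, fams in grouped.items()}


# Invented toy input for illustration only.
TOY = [
    ("lineage_X", "famA"),
    ("lineage_X", "famC"),
    ("lineage_Y", "famB"),
    ("lineage_X", "famA"),
]
GROUPED = group_by_lineage(TOY)
```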