Phylogenetic profiling of the Arabidopsis thaliana proteome: what proteins distinguish plants from other organisms?
1 Department of Energy Plant Research Laboratory, Michigan State University, East Lansing, MI 48824-1312, USA
2 Department of Plant Biology, Michigan State University, East Lansing, MI 48824-1312, USA
3 Delaware Biotechnology Institute, University of Delaware, 15 Innovation Way, Newark, DE 19711, USA
4 Current address: Department of Biology, New York University, 100 Washington Square East, New York, NY 10003, USA
Genome Biology 2004, 5:R53 doi:10.1186/gb-2004-5-8-r53
Published: 15 July 2004
The availability of the complete genome sequence of Arabidopsis thaliana together with those of other organisms provides an opportunity to decipher the genetic factors that define plant form and function. To begin this task, we have classified the nuclear protein-coding genes of Arabidopsis thaliana on the basis of their pattern of sequence similarity to organisms across the three domains of life.
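The classification idea can be illustrated with a minimal sketch, which is not the authors' pipeline. It assumes that each protein has already been searched against sequence sets grouped by lineage and that only the best E-value per group is kept. The group names, the category labels, the `classify` function and the cutoff of 1e-10 are all hypothetical choices made for illustration.

```python
# Hypothetical cutoff: a hit with an E-value at or below this counts as "similar".
E_VALUE_CUTOFF = 1e-10

# Hypothetical lineage groups outside the plant lineage.
NON_PLANT_GROUPS = {"other_eukaryotes", "bacteria", "archaea"}


def classify(best_hits, cutoff=E_VALUE_CUTOFF):
    """Assign a phylogenetic profile label to one protein.

    best_hits maps a group name ("plants", "other_eukaryotes", "bacteria",
    "archaea") to the best E-value found in that group. Groups with no hit
    are simply absent from the mapping.
    """
    # Groups in which the protein has a significant match.
    present = {group for group, evalue in best_hits.items() if evalue <= cutoff}
    outside = present & NON_PLANT_GROUPS

    if not outside:
        # No significant similarity outside the plant lineage.
        return "plant-specific"
    if outside == {"other_eukaryotes"}:
        return "eukaryote-specific"
    if outside == NON_PLANT_GROUPS:
        # Similar sequences in all three domains of life.
        return "universal"
    return "mixed"


if __name__ == "__main__":
    example = {"plants": 1e-80, "other_eukaryotes": 1e-35}
    print(classify(example))
```

In this sketch a weak hit (E-value above the cutoff) is treated the same as no hit, so the number of proteins labeled plant-specific depends directly on the cutoff chosen.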
We identified 3,848 Arabidopsis proteins that are likely to be found solely within the plant lineage. More than half of these plant-specific proteins are of unknown function, emphasizing the general lack of knowledge of processes unique to plants. Plant-specific proteins that are membrane-associated and/or targeted to the mitochondria or chloroplasts are the most poorly characterized. Analyses of microarray data indicate that genes coding for plant-specific proteins, but not evolutionarily conserved proteins, are more likely to be expressed in an organ-specific manner. A large proportion (13%) of plant-specific proteins are transcription factors, whereas other basic cellular processes are under-represented, suggesting that evolution of plant-specific control of gene expression contributed to making plants different from other eukaryotes.
We identified and characterized the Arabidopsis proteins that are most likely to be plant-specific. Our results provide a genome-wide assessment that supports the hypothesis that evolution of higher plant complexity and diversity is related to the evolution of regulatory mechanisms. Because proteins that are unique to the green plant lineage will not be studied in other model systems, they should be attractive priorities for future studies.