Research

Phylogenetic profiling of the Arabidopsis thaliana proteome: what proteins distinguish plants from other organisms?

Rodrigo A Gutiérrez1,2,4*, Pamela J Green3, Kenneth Keegstra1,2 and John B Ohlrogge2

Author Affiliations

1 Department of Energy Plant Research Laboratory, Michigan State University, East Lansing, MI 48824-1312, USA

2 Department of Plant Biology, Michigan State University, East Lansing, MI 48824-1312, USA

3 Delaware Biotechnology Institute, University of Delaware, 15 Innovation Way, Newark, DE 19711, USA

4 Current address: Department of Biology, New York University, 100 Washington Square East, New York, NY 10003, USA

Genome Biology 2004, 5:R53 doi:10.1186/gb-2004-5-8-r53

Published: 15 July 2004

Background

The availability of the complete genome sequence of Arabidopsis thaliana together with those of other organisms provides an opportunity to decipher the genetic factors that define plant form and function. To begin this task, we have classified the nuclear protein-coding genes of Arabidopsis thaliana on the basis of their pattern of sequence similarity to organisms across the three domains of life.
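As a rough illustration of the kind of classification described above, the following Python sketch builds a presence/absence profile for each protein from its best similarity hit in several lineage groups, then assigns a coarse category. It is a minimal sketch under stated assumptions, not the authors' pipeline: the group names, the E-value cutoff of 1e-10, the category labels, the function names `build_profile` and `classify`, and the toy proteins are all hypothetical and do not come from the paper.

```python
from typing import Dict, FrozenSet

# Hypothetical lineage groups (an assumption; the abstract does not list the taxa used).
GROUPS = ("plants", "other_eukaryotes", "bacteria", "archaea")


def build_profile(best_evalues: Dict[str, float], cutoff: float = 1e-10) -> FrozenSet[str]:
    """Return the groups in which a protein has a similarity hit at or below the cutoff.

    best_evalues maps a group name to the best (lowest) E-value found in that
    group; a missing key means no hit was found. The cutoff is illustrative only.
    """
    return frozenset(g for g in GROUPS if best_evalues.get(g, float("inf")) <= cutoff)


def classify(profile: FrozenSet[str]) -> str:
    """Assign a coarse category from the presence/absence profile."""
    if not profile:
        return "no detectable similarity"
    if profile == frozenset({"plants"}):
        return "plant-specific"
    if profile == frozenset(GROUPS):
        return "conserved in all groups"
    return "partially conserved"


# Toy input with made-up protein names and E-values, for illustration only.
toy_hits = {
    "protein_A": {"plants": 1e-50},
    "protein_B": {"plants": 1e-80, "other_eukaryotes": 1e-40,
                  "bacteria": 1e-30, "archaea": 1e-20},
    "protein_C": {"plants": 1e-60, "other_eukaryotes": 1e-3},
    "protein_D": {"plants": 1e-45, "bacteria": 1e-25},
}

categories = {name: classify(build_profile(hits)) for name, hits in toy_hits.items()}

for name, category in categories.items():
    print(f"{name}: {category}")
```

In this toy input, `protein_C` has a weak hit outside plants that does not pass the assumed cutoff, so it is still treated as plant-specific. This shows how strongly the outcome of such a classification depends on the chosen similarity threshold.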
Results

We identified 3,848 Arabidopsis proteins that are likely to be found solely within the plant lineage. More than half of these plant-specific proteins are of unknown function, emphasizing the general lack of knowledge of processes unique to plants. Plant-specific proteins that are membrane-associated and/or targeted to the mitochondria or chloroplasts are the most poorly characterized. Analyses of microarray data indicate that genes coding for plant-specific proteins, but not evolutionarily conserved proteins, are more likely to be expressed in an organ-specific manner. A large proportion (13%) of plant-specific proteins are transcription factors, whereas other basic cellular processes are under-represented, suggesting that evolution of plant-specific control of gene expression contributed to making plants different from other eukaryotes.
Conclusions

We identified and characterized the Arabidopsis proteins that are most likely to be plant-specific. Our results provide a genome-wide assessment that supports the hypothesis that evolution of higher plant complexity and diversity is related to the evolution of regulatory mechanisms. Because proteins that are unique to the green plant lineage will not be studied in other model systems, they should be attractive priorities for future studies.