Significance and context
The ATP-binding cassette (ABC) protein superfamily, the largest protein family known, contains both membrane proteins (ABC transporters) and soluble proteins. Each has one or two ATP-binding cassettes, or nucleotide-binding folds (NBFs), characterized by a Walker A box motif (GX4GK(ST), in the single-letter amino-acid code, with alternatives bracketed) and a Walker B box motif ((RK)X3GX3L(hydrophobic)3), separated by an ABC signature sequence ((LIVMIFY)S(SG)GX3(RKA)(LIVMYA)X(LIVFM)(AG)). ABC transporters, in addition, contain two or three hydrophobic integral transmembrane domains (TMDs) with multiple transmembrane α helices and are involved in the transport of a broad range of substances across membranes. The four core domains (two NBFs and two core TMDs) of these transporters may be expressed as separate polypeptides (half molecules) or as multidomain proteins (full molecules). They can also, on the basis of sequence, be classified as 'forward' (TMD1-NBF1-TMD2-NBF2) or 'reverse' (NBF1-TMD1-NBF2-TMD2). Here, Sanchez-Fernandez et al. report a complete inventory of ABC proteins in Arabidopsis, the first in a multicellular organism. The ABC complement of the Arabidopsis genome (129 occurrences) seems large compared with those of human and yeast, in which 51 and 29 such proteins have been identified, respectively, yet it lacks genes encoding bona fide homologs of the cystic fibrosis transmembrane conductance regulator (CFTR), sulfonylurea receptor (SUR) and heavy metal tolerance factor 1 (HMT1). An unusually high proportion of half molecules (49.5%) was also observed.
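The three motif patterns given above translate fairly directly into regular expressions. The following Python sketch is only an illustration of that translation, not the procedure used by Sanchez-Fernandez et al.: the function name, the set of residues treated as 'hydrophobic' and the toy sequence are all assumptions introduced here, and a real search would use curated profiles rather than bare patterns.

```python
import re

# Assumption: residues counted as "hydrophobic" in the Walker B pattern.
# The report does not define this set.
HYDROPHOBIC = "AILMFVWY"

# Walker A box: GX4GK(ST)
WALKER_A = re.compile(r"G.{4}GK[ST]")
# ABC signature: (LIVMIFY)S(SG)GX3(RKA)(LIVMYA)X(LIVFM)(AG)
# (the repeated I in the first bracket is redundant in a character class)
ABC_SIGNATURE = re.compile(r"[LIVMFY]S[SG]G.{3}[RKA][LIVMYA].[LIVFM][AG]")
# Walker B box: (RK)X3GX3L(hydrophobic)3
WALKER_B = re.compile(rf"[RK].{{3}}G.{{3}}L[{HYDROPHOBIC}]{{3}}")


def has_nbf_motifs(seq: str) -> bool:
    """Return True if Walker A, the ABC signature and Walker B
    occur in that order, without overlapping, in the sequence."""
    seq = seq.upper()
    a = WALKER_A.search(seq)
    if a is None:
        return False
    # The signature must lie between the two Walker boxes.
    s = ABC_SIGNATURE.search(seq, a.end())
    if s is None:
        return False
    return WALKER_B.search(seq, s.end()) is not None


# Synthetic toy sequence (not a real protein), built so that each
# pattern matches exactly once.
toy = "MM" + "GPSGSGKS" + "TT" + "LSGGQRQRVAIA" + "DD" + "RAAAGAAALILV" + "EE"
toy_wrong_order = "RAAAGAAALILV" + "LSGGQRQRVAIA" + "GPSGSGKS"

print(has_nbf_motifs(toy))
print(has_nbf_motifs(toy_wrong_order))
```

Searching for each motif from the end of the previous match enforces the Walker A, signature, Walker B order described above; a sequence carrying the same three motifs in a different order is rejected.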
Sanchez-Fernandez and colleagues searched The Arabidopsis Information Resource database for proteins containing the ABC signature motif as well as the Walker A and Walker B motifs. These proteins were then classified into subfamilies according to their homologs in other organisms. The Arabidopsis genome was found to contain 129 open reading frames (ORFs) encoding ABC proteins, of which 103 were intrinsic membrane proteins and 26 were soluble. The 129 proteins were grouped into 12 subfamilies, of which four comprised full molecules, five comprised half molecules and three comprised soluble proteins. A separate subfamily of non-intrinsic ABC proteins (NAPs) was created to hold the unclassified proteins. The four full-molecule subfamilies were the multidrug resistance protein (MDR, 22 members), multidrug resistance-associated protein (MRP, 15 members), pleiotropic drug resistance (PDR, 13 members) and ABC1 homolog (AOH1, 1 member) subfamilies. Notably, the single AOH1 member had no yeast homolog.
The half-molecule transporters comprised the peroxisomal membrane protein (PMP, 2 members), white-brown complex (WBC, 29 members), ABC2 homolog (ATH, 16 members), ABC transporter of the mitochondrion (ATM, 3 members) and transporter associated with antigen processing (TAP, 3 members) subfamilies. The soluble proteins were grouped into the RNase L inhibitor (RLI, 2 members), general control non-repressible (GCN, 5 members) and structural maintenance of chromosomes (SMC, 4 members) subfamilies. Interestingly, members of the SMC subfamily had an ABC signature that was absent in other homologs. Phylogenetic comparison of these ABC proteins also confirmed the robustness of their classification.
This work might spur similar inventories and classifications of ABC proteins in other organisms, enabling comparative genomic studies that trace the evolution of this protein family and reveal how ABC protein usage and function differ among organisms. The authors speculate that Arabidopsis has an unusually large number of ABC-encoding ORFs because plants, being sessile, need more cellular detoxification mechanisms. Further characterization is needed to reveal how many of these ORFs actually encode functional proteins.