Significance and context
The chloroplast and the molecular machinery responsible for photosynthesis have been the subject of intense study for decades. Peltier et al. have taken advantage of the extraordinary amount of biochemistry that has been done on the chloroplast and the thylakoid membrane, and combined it with some cutting-edge tools to initiate a new study of chloroplast proteins and targeting sequences. They demonstrate how protein purification techniques, mass spectrometry, protein sequencing and the ever-expanding DNA and protein sequence databases can be combined in a new approach to an old problem. It has been estimated that the plant chloroplast contains between 2,000 and 5,000 proteins, although the chloroplast genome encodes only about 120 proteins. Most chloroplast proteins are therefore encoded by the nuclear genome and must be post-translationally targeted to the chloroplast. Whereas the presequences that constitute the amino-terminal transit peptides of these proteins can be predicted with some confidence, the degeneracy of these sequences precludes a PCR-based approach to identifying all chloroplast-localized proteins. Hence a proteomic approach is called for. Recent improvements in protein solubilization techniques and two-dimensional gel electrophoresis now enable the resolution of up to 2,000 proteins on a single two-dimensional gel. Using computer-aided image analysis, two-dimensional gels can be aligned and protein spots of interest can be subjected to further analysis.
Peltier et al. isolated pea chloroplasts and, through a series of purification steps, isolated lumenal and peripheral thylakoid proteins. A series of two-dimensional gels, covering both low and high pH ranges, was used to improve the resolution of the resulting two-dimensional electrophoresis maps. Between about 820 and 920 protein spots could be detected using these methods, which, after adjusting for protein isoforms, post-translational modifications and proteolysis, represent approximately 200 proteins. The authors used three techniques to analyze protein spots: matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) mass spectrometry, electrospray ionization (ESI) tandem mass spectrometry and amino-terminal Edman sequencing. The initial analysis, conducted on 400 spots, consisted of an in-gel protease digestion of the proteins followed by MALDI-TOF mass spectrometry. The list of measured peptide masses produced by this method was compared with the theoretical masses of the predicted tryptic peptides for each entry in the public sequence databases. For proteins not unambiguously identified by this method, peptide sequence tags were obtained by ESI tandem mass spectrometry or Edman sequencing and used for homology-based searches. Peltier et al. successfully identified 61 proteins, and for 33 of these a clear function could be assigned. Of the remainder, 10 had no known function, and for the other 18 proteins no expressed sequence tags or full-length genes were identified.
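The mass-matching step described above can be illustrated with a minimal Python sketch. It is not the authors' software, and it rests on several simplifying assumptions: the protease is trypsin (cleaving after K or R, but not before P), there are no missed cleavages or modifications, ions are singly protonated, and a database entry is scored simply by how many measured masses it explains within a 50 parts per million tolerance. The function names and both toy sequences are invented for illustration.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_peptides(sequence, min_length=5):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    # Very short peptides are rarely informative, so drop them (assumption).
    return [p for p in peptides if len(p) >= min_length]


def mh_mass(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


def ppm_error(measured, theoretical):
    """Signed deviation of a measured mass from a theoretical one, in ppm."""
    return (measured - theoretical) / theoretical * 1e6


def count_matches(measured_masses, sequence, tolerance_ppm=50.0):
    """Number of measured masses explained by the entry's tryptic peptides."""
    theoretical = [mh_mass(p) for p in tryptic_peptides(sequence)]
    return sum(
        any(abs(ppm_error(m, t)) <= tolerance_ppm for t in theoretical)
        for m in measured_masses
    )


def rank_database(measured_masses, database, tolerance_ppm=50.0):
    """Rank database entries by the number of matched peptide masses."""
    scores = {
        name: count_matches(measured_masses, seq, tolerance_ppm)
        for name, seq in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy example: both sequences are invented for illustration.
toy_database = {
    "protein_a": "MSTAGKVLDEFRPSAWGHKLLNNDR",
    "protein_b": "GGASPKTTTWWYYRMMMCCK",
}
# Simulate a spectrum of protein_a with a uniform 10 ppm measurement error.
measured = [
    mh_mass(p) * (1 + 10e-6)
    for p in tryptic_peptides(toy_database["protein_a"])
]
ranking = rank_database(measured, toy_database)
print(ranking)
```

Real search engines use probabilistic scoring and allow for missed cleavages and modifications; counting matches is only the simplest way to show why a list of peptide masses can point to a single database entry.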
The study also examined the predictive power of several programs designed to identify protein-sorting signals. The authors conclude that neither PSORT nor ChloroP is ideal, the former being too conservative and the latter producing too many false positives. The lumenal transit peptides of 26 proteins were determined and found to be similar to bacterial signal peptides. Peltier et al. point out several conserved features in these transit peptides, and suggest that existing programs may be adapted to allow prediction of lumenal transit peptides in a global manner with high confidence.
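The two ways in which a predictor can fall short, being too conservative or calling too many false positives, can be made concrete with a small Python sketch. The data below are invented toy values, not results from the paper or from PSORT or ChloroP; the variable names `conservative_calls` and `permissive_calls` are hypothetical stand-ins for two predictors judged against experimentally determined localization.

```python
def confusion_counts(truth, predicted):
    """Count true/false positives and negatives for paired boolean calls."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for is_chloroplast, called_chloroplast in zip(truth, predicted):
        if is_chloroplast and called_chloroplast:
            counts["tp"] += 1
        elif called_chloroplast:
            counts["fp"] += 1
        elif is_chloroplast:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def sensitivity(counts):
    """Fraction of genuine chloroplast proteins that the predictor finds."""
    return counts["tp"] / (counts["tp"] + counts["fn"])


def false_positive_rate(counts):
    """Fraction of non-chloroplast proteins wrongly called chloroplast."""
    return counts["fp"] / (counts["fp"] + counts["tn"])


# Invented toy data: six chloroplast proteins followed by four others.
truth = [True] * 6 + [False] * 4
conservative_calls = [True, True] + [False] * 8   # misses real targets
permissive_calls = [True] * 9 + [False]           # over-calls

conservative = confusion_counts(truth, conservative_calls)
permissive = confusion_counts(truth, permissive_calls)
print(sensitivity(conservative), false_positive_rate(conservative))
print(sensitivity(permissive), false_positive_rate(permissive))
```

A conservative predictor shows up as low sensitivity with few false positives, and a permissive one as high sensitivity with a high false-positive rate, which is why an experimentally verified set of transit peptides is valuable for tuning such programs.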
It is not entirely clear why the authors chose to use pea as the source of their thylakoid membranes rather than Arabidopsis, especially considering the sensitivity of the methods being used. As the authors themselves point out, a single mismatch between a pea peptide and the corresponding Arabidopsis sequence is enough to prevent a match at the 50 parts per million mass accuracy cut-off being used. Although it is more difficult to isolate chloroplasts from Arabidopsis than from pea, the vast amount of Arabidopsis sequence information available should make that effort worthwhile. On the other hand, pea chloroplasts have been the subject of photosynthesis research for many years and there is a surprising amount of peptide information in the public databases. The researchers were able to identify almost a third of the estimated 200 proteins and to gather important data on lumenal transit peptides. This is clearly one of the first of many proteomics papers that we can expect. Many processes are regulated exclusively at the protein level, and this type of approach, with its ability to measure protein abundance as well as to examine post-translational modifications, will become increasingly useful.
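The point about cross-species matching can be checked with simple arithmetic. The Python sketch below estimates how far a single amino-acid substitution shifts a peptide's mass, expressed in parts per million, and compares it with the 50 ppm cut-off. The residue masses are standard monoisotopic values; the 1,500 Da peptide mass, the chosen substitutions and the function name are assumptions made for illustration.

```python
# Monoisotopic residue masses (Da) for a few amino acids; standard values.
RESIDUE_MASS = {
    "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "I": 113.08406, "D": 115.02694, "E": 129.04259,
}

TOLERANCE_PPM = 50.0    # cut-off quoted in the text
PEPTIDE_MASS = 1500.0   # hypothetical tryptic peptide mass in Da


def substitution_shift_ppm(peptide_mass, original, replacement):
    """Mass shift caused by one residue substitution, in ppm of peptide mass."""
    delta = RESIDUE_MASS[replacement] - RESIDUE_MASS[original]
    return abs(delta) / peptide_mass * 1e6


for original, replacement in [("A", "S"), ("V", "I"), ("D", "E"), ("L", "I")]:
    shift = substitution_shift_ppm(PEPTIDE_MASS, original, replacement)
    verdict = "still matches" if shift <= TOLERANCE_PPM else "match lost"
    print(f"{original}->{replacement}: {shift:.0f} ppm ({verdict})")
```

Under these assumptions a typical substitution shifts the peptide mass by thousands of parts per million, far outside the tolerance, so even conservative sequence differences between pea and Arabidopsis defeat mass-based matching. The isobaric pair leucine and isoleucine is the obvious exception.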