Web tools for predicting organelle targeting and transit peptide proteolysis sites
ChloroP
TargetP
SignalP
PSORT
Predotar
MitoProt II
ChloroP and TargetP reliably predicted chloroplast targeting, but they reliably predicted transit peptide cleavage sites only for soluble proteins targeted to the stroma. SignalP (eukaryote settings) accurately predicted the transit peptide cleavage site for soluble proteins targeted to the lumen. SignalP (Gram-negative bacteria settings) reliably predicted peptide cleavage of integral thylakoid proteins inserted into the membrane through the “spontaneous” pathway. The processing sites of the more common thylakoid-integral proteins, inserted by the signal recognition particle-dependent pathway, were not well predicted by any of the programs. The results suggest the presence of a second thylakoid processing protease that recognizes the transit peptide of integral proteins inserted via the spontaneous mechanism, and that this mechanism may be related to the secretory mechanism of Gram-negative bacteria.
Proteins targeted to the thylakoid lumen have bipartite transit peptides. The stromal targeting information is in the amino-proximal portion of the transit peptide, and the carboxyl-proximal portion, which resembles the signal peptides of secreted proteins in bacteria, contains the information for targeting to the lumen. Two different pathways for targeting to the lumen have been characterized. The first is a Sec-dependent pathway related to the SecYEG export mechanism in bacteria. The second, the Tat (twin-arginine translocase) pathway, is a ΔpH-dependent mechanism characterized by two conserved consecutive arginines in the transit peptide. Proteins targeted to the lumen via the Sec pathway are generally translocated in an unfolded state, whereas proteins imported via the Tat pathway are translocated in a folded state. Proteins imported into the lumen by either pathway are processed by a thylakoid processing protease that removes the carboxyl-proximal portion of the transit peptide.
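The twin-arginine feature of the Tat pathway described above can be illustrated with a minimal Python sketch that reports where two consecutive arginines occur near the amino terminus of a sequence. The function name `find_twin_arginines`, the 100-residue window, and the toy sequence are assumptions made for illustration. A bare RR scan is only a crude first-pass screen: it is not how any of the listed tools works, and it says nothing about the flanking residues or hydrophobic region that a real signal would require.

```python
import re


def find_twin_arginines(sequence, window=100):
    """Return 1-based positions of each 'RR' pair in the first `window` residues.

    The window size is an arbitrary illustrative choice, not a value from the study.
    """
    region = sequence[:window].upper()
    # A lookahead is used so that overlapping pairs (e.g. in 'RRR') are all reported.
    return [match.start() + 1 for match in re.finditer(r"(?=RR)", region)]


# Made-up sequence fragment for illustration only; not a real transit peptide.
toy_sequence = "MASTLSRRQLLAAGA"
print(find_twin_arginines(toy_sequence))
```

A hit from this scan would only flag a candidate for closer inspection; it does not by itself assign a protein to the Tat pathway.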
Proteomics studies using two-dimensional (2D) gel electrophoresis have provided useful insights into soluble proteins of the lumen and stroma as well as peripheral and integral thylakoid proteins, but these methods do not allow convenient determination of the amino terminus when it is blocked, as is common in the chloroplast. In this work, a dataset of 35 nuclear-encoded integral thylakoid membrane proteins from A. thaliana, for which the sites of secondary amino-terminal processing were explicitly determined by MS, was used to challenge a range of bioinformatics tools. After initial trials, the dataset was expanded to include previously published photosystem (PS) II thylakoid-associated proteins from pea and spinach and a small number of tobacco integral membrane proteins (IMPs), providing a total of 58 nuclear-encoded integral thylakoid membrane proteins, each with an experimentally determined amino terminus. The results demonstrate that although some programs (ChloroP, TargetP) reliably predict trafficking to the chloroplast, their predictions of transit peptide processing sites are inconsistent. The results highlight inadequacies of currently available bioinformatics tools for predicting secondary amino-terminal processing of transit peptides of integral thylakoid proteins and demonstrate the need for improvements in the datasets used to train the algorithms.
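The comparison described above, predicted processing sites against experimentally determined amino termini, can be sketched as a simple tally. This is a minimal sketch under stated assumptions: the function name `score_cleavage_predictions`, the record layout, the tolerance parameter, and all values in `toy_records` are hypothetical and are not taken from the study, which does not specify its scoring procedure here.

```python
def score_cleavage_predictions(records, tolerance=0):
    """Count predictions that agree with the observed amino terminus.

    records: iterable of (name, predicted_start, observed_start), where each
    start is the 1-based position of the first residue of the mature protein,
    and predicted_start is None when the tool reported no cleavage site.
    tolerance: maximum allowed distance, in residues, to count as agreement.
    """
    records = list(records)
    hits = sum(
        1
        for _name, predicted, observed in records
        if predicted is not None and abs(predicted - observed) <= tolerance
    )
    return hits, len(records)


# Made-up values for illustration only; not data from the study.
toy_records = [
    ("protein_A", 51, 51),    # prediction matches the observed terminus
    ("protein_B", 45, 48),    # prediction is off by three residues
    ("protein_C", None, 60),  # tool reported no cleavage site
]

exact, total = score_cleavage_predictions(toy_records)
near, _ = score_cleavage_predictions(toy_records, tolerance=3)
print(f"exact: {exact}/{total}, within 3 residues: {near}/{total}")
```

Reporting both exact and near matches is one possible design choice; it separates tools that miss the site entirely from those that place it a few residues away.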