Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics
Predicting eukaryotic protein secretion without signals
Introduction
Prediction of classical signal peptide-based protein secretion has a long history in bioinformatics, with the earliest methods published in the 1980s [[1], [2], [3]]. The secretory signal peptide is probably the best-known and best-described protein sorting signal, and the large interest in signal peptide prediction is reflected in the high number of citations to the papers describing the SignalP method [[4], [5], [6]], which has been available online since 1996 and is currently in version 4.1 [7].
SignalP is an example of a signal-based method for protein sorting prediction, where the computational model recognizes the actual sorting signal. The two other approaches are global property-based methods and homology-based methods [8]. Global property-based methods exploit the fact that proteins in different compartments have different physicochemical properties, which is reflected in, for example, different amino acid compositions, especially at the surfaces of the proteins [9]. The earliest method for distinguishing between intra- and extracellular proteins based on amino acid and amino acid pair compositions was published in 1994 [10]. Homology-based methods, on the other hand, exploit the fact that proteins tend to stay in the same compartment during the course of evolution, meaning that subcellular location can often be inferred by homology to proteins with known location [11].
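As an illustration of the kind of input a global property-based method works from, the following minimal Python sketch computes amino acid composition and residue-pair frequencies for a sequence. It is not the method of [10]: the function names are hypothetical, and counting only directly adjacent residue pairs is an assumption made here for simplicity.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in seq.

    Non-standard characters (e.g. X, B, Z) are ignored.
    """
    seq = seq.upper()
    counts = Counter(c for c in seq if c in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


def residue_pair_frequencies(seq):
    """Return the fraction of each adjacent residue pair in seq.

    Assumption: a "pair" is two directly neighbouring residues;
    pairs containing a non-standard character are skipped.
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    total = len(valid)
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


def feature_vector(seq):
    """Concatenate the 20 composition values and 400 pair frequencies."""
    comp = amino_acid_composition(seq)
    pairs = residue_pair_frequencies(seq)
    return [comp[aa] for aa in AMINO_ACIDS] + [pairs[k] for k in sorted(pairs)]
```

A vector like the one returned by `feature_vector` could then be passed to any standard classifier trained on proteins of known location; which classifier and which training set to use are left open here.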
However, not all secreted proteins follow the “classical” signal peptide-dependent pathway. An increasing number of eukaryotic proteins have been found to be released without passing the endomembrane system, including proteins with very important functions like cytokines [12]. Such proteins will go undetected by signal peptide-dependent prediction methods such as SignalP.
When attempting to predict which proteins are secreted by unconventional “non-classical” signal peptide-independent routes, especially in eukaryotes, one is faced with two obstacles. First, the signal-based approach is not available, since it is generally not known where in the sequence the signals for secretion occur. Second, the number of experimentally confirmed examples from which to build a training set is extremely small.
In bacteria, the situation is different, since many more examples of signal peptide-independent secretion are known (rarely termed “non-classical” in bacteria). In Gram-negative bacteria, the type I, III, IV, and VI secretion pathways function without signal peptides, and in some cases, there is evidence of N-terminal or C-terminal sorting signals [8,13]. In Gram-positive bacteria, there are also a few known pathways (Wss, holin, and SecA2) [13,14]. This paper will discuss prediction of non-classical secretion in eukaryotes only; prediction in bacteria has been described elsewhere [8,14].
Section snippets
The SecretomeP method
SecretomeP is a method from 2004 [15] for predicting non-classically secreted proteins from Mammalia. It was published by our former colleagues at the Center for Biological Sequence Analysis, which was later transformed into the Department of Bio and Health Informatics. SecretomeP 2.0, published in 2005 [16], added the possibility of prediction in Gram-positive and Gram-negative bacteria; the mammalian part was not modified or retrained.
The authors chose a novel way to deal with the two obstacles
Other dedicated methods
Besides SecretomeP, we are aware of five other published methods specifically designed to predict secretion without signal peptides in eukaryotes. These predictive tools have been summarized in Table 1.
Interestingly, all these methods, like SecretomeP, focus on mammalian proteins alone; no method is available for non-mammal eukaryotes. However, none of the papers actually argue for that choice or cite any references showing that non-classical secretion in mammals differs from the process in,
Multi-location predictors
Besides methods that predict whether or not a protein is secreted, there are also several methods available which predict a larger number of subcellular locations, including “secreted” or “extracellular”. Such multi-location predictors could potentially also be used to predict secretion without signal peptides. However, since the majority of secreted proteins have signal peptides, some kind of signal peptide prediction will usually be built into such methods, either implicitly or explicitly. If
A critical re-evaluation of SecretomeP performance
In the years since SecretomeP was first developed, considerably more data have become available on proteins that are secreted in a non-classical manner. In addition, one common question addressed to the curators of the SecretomeP web service is whether it performs as well for all eukaryotic sequences as it does for mammalian sequences. This presents an opportunity for a critical re-evaluation of SecretomeP's performance.
We collected two data sets from UniProt, one with
Discussion
SecretomeP version 1 was, for its time, a bold and innovative suggestion for how to construct a predictor for secretion without signal peptides. It has been cited >800 times according to Google Scholar, and it is still being used extensively. However, its performance, measured on new independent data, is not nearly as good as we thought it would be, and the underlying hypothesis that extracellular proteins share features independent of the secretion pathway must be called into question.
SRTpred
Acknowledgements
The corresponding author is paid by the Technical University of Denmark. The authors (LZ and KS) thank the research commission of the University Hospital Düsseldorf for funding (FOKO 2018-27). The authors wish to thank Krishna Kumar Kandaswamy from the SPRED team for assistance with running the program.
References (46)
On the predictive recognition of signal peptide sequences. Virus Res. (1985)
Adaptation of protein surfaces to subcellular location. J. Mol. Biol. (1998)
Discrimination of intracellular and extracellular proteins using amino acid composition and residue-pair frequencies. J. Mol. Biol. (1994)
Mechanisms of cytokine secretion: a portfolio of distinct pathways allows flexibility in cytokine activity. Eur. J. Cell Biol. (2011)
Secretion and subcellular localizations of bacterial proteins: a semantic awareness issue. Trends Microbiol. (2009)
Prediction of human protein function from post-translational modifications and localization features. J. Mol. Biol. (2002)
Statistics of local complexity in amino acid sequences and sequence databases. Comput. Chem. (1993)
Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. (2001)
PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem. Sci. (1999)
Membrane protein structure prediction: Hydrophobicity analysis and the positive-inside rule. J. Mol. Biol. (1992)
SecretP: a new method for predicting mammalian secreted proteins. Peptides
SecretP: identifying bacterial secreted proteins by fusing new features into Chou's pseudo-amino acid composition. J. Theor. Biol.
SPRED: a machine learning approach for the identification of classical and non-classical secretory proteins in mammalian genomes. Biochem. Biophys. Res. Commun.
Ranking Gene Ontology terms for predicting non-classical secretory proteins in eukaryotes and prokaryotes. J. Theor. Biol.
Patterns of amino acids near signal-sequence cleavage sites. Eur. J. Biochem.
A new method for predicting signal sequence cleavage sites. Nucleic Acids Res.
Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng.
Improved prediction of signal peptides: SignalP 3.0. J. Mol. Biol.
SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat. Methods
Predicting secretory proteins with SignalP
Protein sorting prediction
Sequence conserved for subcellular localization. Protein Sci.
Predicting subcellular localization of proteins by bioinformatic algorithms