Abstract
The rapid spread of multi-drug resistant microbes has led researchers to seek natural alternative remedies such as antimicrobial peptides (AMPs). As a first line of defense, AMPs display a broad spectrum of potent activity against multi-resistant pathogenic bacteria, viruses, fungi, and even cancer. AMPs can be further characterised into families according to amino acid composition, secondary structure, and function. However, despite recent advances in rapid computational methods for predicting AMPs from various mammalian, aquatic, and terrestrial species, there is limited information on their presence, functional roles, and family type in marine macroalgae. In this paper, we present a promising two-tier ensemble of heterogeneous machine learning models that integrates seven well-known machine learning classifiers to predict AMPs from macroalgae. The first tier of the ensemble consists of a suite of binary classifiers that identify AMPs from protein sequence data; the predicted AMPs are then forwarded to a second-tier multi-class ensemble that characterises their functional family type. The two-tier ensemble was successfully used to identify 39 putative AMP sequences in 12 macroalgae species from three different phyla. The approach we describe is not limited to AMPs and can also be applied to search sequence data for other types of proteins.
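The two-tier flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes simple majority voting in both tiers, uses three toy rule-based stand-ins per tier instead of the seven trained classifiers, and every function name, threshold, and family label below is hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence

# Assumption: a classifier is any callable that maps a peptide sequence to a label.
BinaryClassifier = Callable[[str], bool]
FamilyClassifier = Callable[[str], str]


def majority_vote(labels: Sequence[Hashable]) -> Hashable:
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def two_tier_predict(
    sequence: str,
    tier1: Sequence[BinaryClassifier],
    tier2: Sequence[FamilyClassifier],
) -> Optional[str]:
    """Tier 1 votes AMP vs non-AMP; only predicted AMPs are passed to tier 2."""
    if not majority_vote([clf(sequence) for clf in tier1]):
        return None  # rejected by the binary tier, never reaches tier 2
    return majority_vote([clf(sequence) for clf in tier2])


# Hypothetical tier-1 rules (toy heuristics, not biologically validated).
def is_cationic(seq: str) -> bool:
    positive = sum(seq.count(r) for r in "KR")
    negative = sum(seq.count(r) for r in "DE")
    return positive > negative


def is_short(seq: str) -> bool:
    return len(seq) <= 50


def is_hydrophobic(seq: str) -> bool:
    hydrophobic = sum(seq.count(r) for r in "AILMFVW")
    return hydrophobic / max(len(seq), 1) >= 0.3


# Hypothetical tier-2 rules that assign an illustrative family label.
def cysteine_rich(seq: str) -> str:
    return "defensin-like" if seq.count("C") >= 6 else "other"


def proline_rich(seq: str) -> str:
    return "proline-rich" if seq.count("P") / max(len(seq), 1) >= 0.25 else "other"


def paired_cysteines(seq: str) -> str:
    n_cys = seq.count("C")
    return "defensin-like" if n_cys >= 4 and n_cys % 2 == 0 else "other"


TIER1 = [is_cationic, is_short, is_hydrophobic]
TIER2 = [cysteine_rich, proline_rich, paired_cysteines]

if __name__ == "__main__":
    for peptide in ["GKCRCKCRCKCRCGK", "DEDEDEDEDE"]:
        print(peptide, "->", two_tier_predict(peptide, TIER1, TIER2))
```

In a real pipeline each callable would wrap a trained model applied to features extracted from the sequence. The point of the sketch is only the gating: a sequence that the binary tier rejects is never assigned a family.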
Cite this article
Caprani, M.C., Healy, J., Slattery, O. et al. Using an Ensemble to Identify and Classify Macroalgae Antimicrobial Peptides. Interdiscip Sci Comput Life Sci 13, 321–333 (2021). https://doi.org/10.1007/s12539-021-00435-6
DOI: https://doi.org/10.1007/s12539-021-00435-6