Monomer structure fingerprints: an extension of the monomer composition version for peptide databases

Journal of Computer-Aided Molecular Design

Abstract

A fingerprint based on the monomer composition (MCFP) of nonribosomal peptides (NRPs) was introduced previously. MCFP gives a representative description of an NRP structure, in fingerprint form, derived from its monomer composition, and it has enabled effective screening and prediction of biological activities on the Norine NRP database. In this paper, we present an extension of the MCFP fingerprint. The extension adds a few columns to the fingerprint, representing monomer clusters, 2D structures, peptide categories, and peptide diversity; all of these data are extracted from the NRP structure. In experiments with the Norine NRP database, the extended MCFP, which we call the Monomer Structure FingerPrint (MSFP), produced high prediction accuracy (> 95%) together with a high recall rate (86%) when used for prediction and similarity searching. The study indicates that a fingerprint built mainly from monomer composition can be substantially improved by adding columns that carry useful information about the monomer composition and 2D structure of NRPs.
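As a rough illustration of the idea of appending columns to a composition fingerprint, the following Python sketch builds a count vector over monomers and then extends it with cluster counts, a one-hot 2D-structure type, and a diversity value. Everything concrete here is an assumption made for illustration: the monomer vocabulary, the cluster mapping, the structure types, the diversity definition (number of distinct monomers), and the function names `mcfp`, `msfp`, and `tanimoto` are hypothetical and not taken from the paper. The peptide category columns are omitted for brevity. Similarity is computed with a generalized (min/max) Tanimoto coefficient for count vectors, which may differ from the measure the authors used.

```python
from collections import Counter

# Hypothetical vocabularies (assumptions); real ones would come from Norine.
MONOMERS = ["Ala", "Gly", "Leu", "Orn", "Val"]
CLUSTER_OF = {"Ala": "small", "Gly": "small", "Leu": "aliphatic",
              "Val": "aliphatic", "Orn": "basic"}
CLUSTERS = ["aliphatic", "basic", "small"]
STRUCTURES = ["linear", "cyclic", "branched"]


def mcfp(monomers):
    """Composition-only fingerprint: one count column per known monomer."""
    counts = Counter(monomers)
    return [counts[m] for m in MONOMERS]


def msfp(monomers, structure):
    """Composition columns followed by the illustrative extra columns."""
    cluster_counts = Counter(CLUSTER_OF[m] for m in monomers)
    cluster_cols = [cluster_counts[c] for c in CLUSTERS]
    # One-hot encoding of the (assumed) 2D structure type.
    structure_cols = [1 if structure == s else 0 for s in STRUCTURES]
    # Assumed diversity measure: number of distinct monomers in the peptide.
    diversity_col = [len(set(monomers))]
    return mcfp(monomers) + cluster_cols + structure_cols + diversity_col


def tanimoto(a, b):
    """Generalized Tanimoto for count vectors: sum of minima over sum of maxima."""
    num = sum(min(x, y) for x, y in zip(a, b))
    den = sum(max(x, y) for x, y in zip(a, b))
    return num / den if den else 1.0


# Two made-up peptides used only to exercise the functions.
p1 = (["Leu", "Leu", "Val", "Orn"], "cyclic")
p2 = (["Leu", "Val", "Val", "Orn"], "linear")

print(tanimoto(mcfp(p1[0]), mcfp(p2[0])))
print(tanimoto(msfp(*p1), msfp(*p2)))
```

In this toy setting the extra columns change the similarity between the two peptides because they share cluster membership and diversity but differ in structure type; whether such columns help on real data is what the paper evaluates.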

Funding

This work was supported by Lille University, CNRS and Programme national d’aide à l’Accueil en Urgence des Scientifiques en Exil (PAUSE).

Author information

Contributions

The research was conducted by mutual contributions of all authors. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Ammar Abdo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (TXT 4 kb)

Supplementary material 2 (PDF 52 kb)

About this article

Cite this article

Abdo, A., Ghaleb, E., Alajmi, N.K.A. et al. Monomer structure fingerprints: an extension of the monomer composition version for peptide databases. J Comput Aided Mol Des 34, 1147–1156 (2020). https://doi.org/10.1007/s10822-020-00336-8

  • DOI: https://doi.org/10.1007/s10822-020-00336-8
