Abstract
An elastic network model (ENM) represents a molecule as a matrix of pairwise atomic interactions. Rich in coded information, ENMs are proposed here as a novel tool for predicting the activity of series of molecules that differ widely in chemical structure but share a common biological activity. The new approach is developed and tested using a set of 183 inhibitors of the serine/threonine-protein kinase Plk3, an enzyme implicated in the regulation of the cell cycle and in tumorigenesis. The elastic network (EN) predictive model is found to exhibit high accuracy and speed compared with descriptor-based machine-trained modeling. EN modeling appears to be a highly promising new tool for the high demands of industrial applications such as drug and materials design.
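To make the idea of a molecule encoded as a matrix of pairwise atomic interactions concrete, the following is a minimal sketch of one of the simplest elastic-network matrices, a Kirchhoff (connectivity) matrix built from atomic coordinates with a distance cutoff. It is an illustration only: the function name `kirchhoff_matrix`, the 2.0 Å cutoff, and the three-atom coordinates are assumptions made for this example, and the abstract does not state which matrix form or parameters the authors actually use.

```python
import numpy as np

def kirchhoff_matrix(coords, cutoff=2.0):
    """Connectivity matrix of a simple elastic network (illustrative sketch).

    Off-diagonal entry (i, j) is -1 if atoms i and j lie within `cutoff`
    of each other and 0 otherwise; diagonal entry (i, i) is the number
    of contacts of atom i. The cutoff value is an assumption.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise distance matrix via broadcasting: shape (N, N)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Two distinct atoms are connected by a spring if within the cutoff
    contact = (dist <= cutoff) & ~np.eye(len(coords), dtype=bool)
    gamma = np.where(contact, -1.0, 0.0)
    np.fill_diagonal(gamma, contact.sum(axis=1))
    return gamma

# Hypothetical linear three-atom chain with 1.5 Å spacing
coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 0.0, 0.0]]
gamma = kirchhoff_matrix(coords, cutoff=2.0)
print(gamma)
```

In this toy chain only neighboring atoms are within the cutoff, so each row of the matrix sums to zero and the middle atom has two contacts. A matrix of this kind, or quantities derived from it, could then serve as input to a machine-learning model in place of conventional molecular descriptors.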
Acknowledgements
The authors are grateful to the two anonymous reviewers for their excellent suggestions and corrections that helped improve this paper. C.A.T. is grateful to the Ministry of Science, Research and Technology (MSRT), Iran. C.F.M. acknowledges the Natural Sciences and Engineering Research Council of Canada (NSERC), the Canada Foundation for Innovation (CFI), and Mount Saint Vincent University for funding.
Cite this article
Toussi, C.A., Haddadnia, J. & Matta, C.F. Drug design by machine-trained elastic networks: predicting Ser/Thr-protein kinase inhibitors’ activities. Mol Divers 25, 899–909 (2021). https://doi.org/10.1007/s11030-020-10074-6
DOI: https://doi.org/10.1007/s11030-020-10074-6