Abstract
The water-octanol partition coefficient is a measure of the lipophilicity of a molecule and is important in drug discovery. We developed a novel method for computational prediction of the logarithm of the partition coefficient (logP) using molecular fingerprints and a deep neural network. The machine learning model was trained on a dataset of 12,000 molecules and tested on 2,000 molecules. In this article, we present our results for the blind prediction of logP in the SAMPL6 challenge. While the best submission achieved an RMSE of 0.41 logP units, our submission had an RMSE of 0.61 logP units. Overall, our submission ranked in the top quarter of the 92 submissions. Our results show that the deep learning model can be used as a fast, accurate and robust method for high-throughput prediction of logP of small molecules.
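Submissions are compared above by RMSE in logP units. As a minimal sketch of how that metric is computed, the following function takes the root of the mean squared difference between predicted and experimental values. The function name `rmse` and the example numbers are hypothetical and chosen for illustration only; they are not SAMPL6 data and this is not the challenge's own evaluation code.

```python
import math


def rmse(predicted, experimental):
    """Root-mean-square error between two equal-length sequences of logP values."""
    if len(predicted) != len(experimental):
        raise ValueError("sequences must have the same length")
    if not predicted:
        raise ValueError("sequences must not be empty")
    # Square each prediction error, average, then take the square root.
    squared_errors = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values for illustration only; not SAMPL6 data.
predicted = [2.0, 3.5, 1.0, 4.0]
experimental = [2.5, 3.0, 1.5, 3.5]
print(rmse(predicted, experimental))  # every error is 0.5, so this prints 0.5
```

Because the errors are squared before averaging, a single large miss raises the RMSE more than several small ones, which is why the metric is reported in the same units as logP itself.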
Acknowledgements
Samarjeet would like to thank the Biochemistry, Cellular and Molecular Biology (BCMB) Program at JHU-SOM for supporting his graduate training. We thank the LoBos and Biowulf teams at NIH for providing the high-performance computing support used to carry out this work. This study was supported by the Intramural Research Program of the National Heart, Lung and Blood Institute.
Cite this article
Prasad, S., Brooks, B.R. A deep learning approach for the blind logP prediction in SAMPL6 challenge. J Comput Aided Mol Des 34, 535–542 (2020). https://doi.org/10.1007/s10822-020-00292-3
DOI: https://doi.org/10.1007/s10822-020-00292-3