The QSAR similarity principle in the deep learning era: Confirmation or revision?

Gini, Giuseppina

doi:10.1007/s10698-020-09380-6

Published: 15 July 2020

Volume 22, pages 383–402 (2020)

Foundations of Chemistry

Giuseppina Gini (ORCID: orcid.org/0000-0002-0334-420X)

Abstract

Structure–activity relationship (SAR) and quantitative SAR (QSAR) are modeling methods widely used to assess the biological properties of chemical substances. QSAR rests on the hypothesis that chemical structure is responsible for activity; it follows that similar molecules are expected to have similar properties. Similarity plays an important role in read-across, which categorizes molecules primarily on the basis of similarity. Similarity, including chemical similarity, is a property that humans perceive differently. The various proposed metrics often disagree with human judgment, and no single metric for chemical similarity is universally adopted. Researchers have argued that categorization is not explained by similarity alone but also depends on abstract knowledge and on the task to be accomplished. Moreover, similarity cannot be the sole explanation of a categorization, as different perceptual processes take place in category formation. Assuming that similarity judgments are deeply rooted in human knowledge and perception, the contributions of the cognitive sciences are as important as the mathematical considerations of the classical theories. After an excursus through the many views of similarity in philosophy, mathematics, and cognitive science, the paper explores how connectionist systems, which loosely mimic the human cognitive system, could improve similarity-based choices. A case study on building (Q)SARs using connectionism and deep neural networks shows the role of similarity in building and explaining those models. A discussion of deep learning for QSAR, and as a modeling tool for science, concludes the presentation.

Author information

Authors and Affiliations

DEIB, Politecnico di Milano, Milan, Italy
Giuseppina Gini

Corresponding author

Correspondence to Giuseppina Gini.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gini, G. The QSAR similarity principle in the deep learning era: Confirmation or revision? Found Chem 22, 383–402 (2020). https://doi.org/10.1007/s10698-020-09380-6

Published: 15 July 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s10698-020-09380-6
