SAR and QSAR research on tyrosinase inhibitors using machine learning methods
SAR and QSAR in Environmental Research (IF 3), Pub Date: 2021-02-01, DOI: 10.1080/1062936x.2020.1862297
Y. Wu¹, D. Huo¹, G. Chen², A. Yan¹

ABSTRACT

Tyrosinase is a key rate-limiting enzyme in melanin synthesis and is closely related to human pigmentation disorders. Tyrosinase inhibitors can down-regulate tyrosinase and thereby reduce melanin synthesis. In this work, we conducted a structure–activity relationship (SAR) study on 1097 diverse mushroom tyrosinase inhibitors. We applied five machine learning methods to develop 15 classification models. Model 5B, built with fully connected neural networks and ECFP4 fingerprints, achieved the highest prediction accuracy (91.36%) and a Matthews correlation coefficient (MCC) of 0.81 on the test set. The applicability domains (AD) of the classification models were defined by the dSTD-PRO method. Moreover, we clustered the 1097 inhibitors into eight subsets by K-Means to identify the inhibitors' structural features. In addition, 10 quantitative structure–activity relationship (QSAR) models were constructed with four machine learning methods, based on 813 inhibitors. Model 6J, the best QSAR model, was developed with fully connected neural networks and 50 RDKit descriptors. It achieved a coefficient of determination (r²) of 0.770 and a root mean squared error (RMSE) of 0.482 on the test set. The AD of Model 6J was visualized with a Williams plot. The models built in this study can be obtained from the authors.
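For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy, MCC, RMSE, and the coefficient of determination are commonly computed. It is not the authors' code: the function names and the toy labels and values are hypothetical. It also assumes that r² means 1 − SS_res/SS_tot, which the abstract does not specify.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded 1 (active) / 0 (inactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data, NOT taken from the paper.
cls_true = [1, 1, 1, 0, 0, 0, 1, 0]
cls_pred = [1, 1, 0, 0, 0, 1, 1, 0]
reg_true = [5.0, 6.0, 7.0, 8.0]   # e.g. illustrative activity values
reg_pred = [5.5, 6.0, 6.5, 8.0]

print(f"accuracy = {accuracy(cls_true, cls_pred):.3f}")
print(f"MCC      = {mcc(cls_true, cls_pred):.3f}")
print(f"RMSE     = {rmse(reg_true, reg_pred):.3f}")
print(f"r^2      = {r_squared(reg_true, reg_pred):.3f}")
```

MCC is often reported alongside accuracy because it uses all four cells of the confusion matrix, which makes it less sensitive to class imbalance than accuracy alone.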




Updated: 2021-02-19