In Silico Structure Predictions for Non-targeted Analysis: From Physicochemical Properties to Molecular Structures
ACS Environmental Au, Pub Date: 2022-06-01, DOI: 10.1021/jasms.1c00386
Dimitri Abrahamsson 1 , Adi Siddharth 1 , Thomas M. Young 2 , Marina Sirota 3, 4 , June-Soo Park 1, 5 , Jonathan W. Martin 6 , Tracey J. Woodruff 1

While important advances have been made in high-resolution mass spectrometry (HRMS) and its applications in non-targeted analysis (NTA), the number of identified compounds in biological and environmental samples often does not exceed 5% of the detected chemical features. Our aim was to develop a computational pipeline that leverages data from HRMS but also incorporates physicochemical properties (equilibrium partition ratios between organic solvents and water; K_solvent–water) and can propose molecular structures for detected chemical features. As these physicochemical properties are often sufficiently different across isomers, when put together, they can form a unique profile for each isomer, which we describe as the “physicochemical fingerprint”. In our study, we used a comprehensive database of compounds that have been previously reported in human blood and collected their K_solvent–water values for 129 partitioning systems. We used RDKit to calculate the number of RDKit fragments and the number of RDKit bits per molecule. We then developed and trained an artificial neural network, which used as an input the physicochemical fingerprint of a chemical feature and predicted the number and types of RDKit fragments and RDKit bits present in that structure. These were then used to search the database and propose chemical structures. The average success rate of predicting the right chemical structure ranged from 60 to 86% for the training set and from 48 to 81% for the testing set. These observations suggest that physicochemical fingerprints can assist in the identification of compounds with NTA and substantially improve the number of identified compounds.
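
The final step of the pipeline, using predicted fragment counts to search a database and propose structures, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three candidate isomers, the hand-counted functional groups standing in for RDKit fragment counts, the "predicted" vector standing in for a neural network output, and the choice of Euclidean distance as the ranking criterion. The abstract does not state how the database search is scored.

```python
from math import sqrt

# Hand-counted functional groups for three C3H8O isomers. Illustrative only:
# the study derives its fragment and bit counts with RDKit, not by hand.
CANDIDATES = {
    "1-propanol": {"hydroxyl": 1, "ether": 0, "methyl": 1},
    "2-propanol": {"hydroxyl": 1, "ether": 0, "methyl": 2},
    "ethyl methyl ether": {"hydroxyl": 0, "ether": 1, "methyl": 2},
}


def distance(predicted, counts):
    """Euclidean distance between a predicted and a stored count profile."""
    keys = set(predicted) | set(counts)
    return sqrt(sum((predicted.get(k, 0.0) - counts.get(k, 0)) ** 2 for k in keys))


def rank_candidates(predicted, candidates):
    """Return (name, distance) pairs sorted from best to worst match."""
    scored = ((name, distance(predicted, counts)) for name, counts in candidates.items())
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical model output: non-integer counts, as a regression network might emit.
predicted = {"hydroxyl": 0.9, "ether": 0.1, "methyl": 1.8}

ranked = rank_candidates(predicted, CANDIDATES)
for name, d in ranked:
    print(f"{name}: {d:.3f}")
```

In this toy case the three isomers share a molecular formula, so an exact mass alone could not separate them. The predicted counts place 2-propanol closest, which is the kind of isomer discrimination the physicochemical fingerprint is meant to enable.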

Updated: 2022-06-01