The rcdk and cluster R packages applied to drug candidate selection
Journal of Cheminformatics (IF 8.6), Pub Date: 2020-01-20, DOI: 10.1186/s13321-019-0405-0
Adrian Voicu, Narcis Duteanu, Mirela Voicu, Daliborca Vlad, Victor Dumitrascu

The aim of this article is to show how the power of statistics and cheminformatics can be combined, in R, using two packages: rcdk and cluster. We describe the role of clustering methods for identifying similar structures in a group of 23 molecules according to their fingerprints. The most commonly used method is to group the molecules using a "score" obtained by measuring the average distance between them. This score reflects the similarity or dissimilarity between compounds and helps us identify active or potentially toxic substances through predictive studies. Clustering is the process by which the common characteristics of a particular class of compounds are identified. For clustering applications, we generally measure molecular fingerprint similarity with the Tanimoto coefficient. Based on the molecular fingerprints, we calculated the molecular distances between the methotrexate molecule and the other 23 molecules in the group, and organized them into a matrix. According to the molecular distances and Ward's method, the molecules were grouped into 3 clusters. We can presume structural similarity between compounds from their locations in the cluster map. Because only 5 molecules were included in the methotrexate cluster, we considered that they might have similar properties and might be further tested as potential drug candidates.
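The workflow described in the abstract (Tanimoto similarity on fingerprints, a distance matrix, then Ward clustering cut at 3 clusters) can be illustrated with a minimal sketch. This is not the authors' code and does not use rcdk or cluster: it is a plain Python stand-in, and the fingerprints, the `mol_A` to `mol_F` names, and the helper functions `tanimoto`, `distance_matrix`, and `ward_clusters` are hypothetical, chosen only for illustration. It assumes each fingerprint is represented as a set of on-bit indices and that distance is taken as 1 minus the Tanimoto coefficient.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def distance_matrix(fps):
    """Symmetric matrix of Tanimoto distances (1 - similarity)."""
    n = len(fps)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = 1.0 - tanimoto(fps[i], fps[j])
    return d


def ward_clusters(dist, k):
    """Agglomerative clustering with Ward's criterion, stopped at k clusters.

    Uses the Lance-Williams update on squared distances. Cluster ids are
    integers; a merged cluster always gets a new, larger id.
    """
    n = len(dist)
    members = {i: [i] for i in range(n)}
    d2 = {(i, j): dist[i][j] ** 2 for i, j in combinations(range(n), 2)}
    next_id = n
    while len(members) > k:
        a, b = min(d2, key=d2.get)  # closest pair of clusters
        dab = d2[(a, b)]
        na, nb = len(members[a]), len(members[b])
        new = next_id
        next_id += 1
        for c in list(members):
            if c in (a, b):
                continue
            nc = len(members[c])
            dac = d2[tuple(sorted((a, c)))]
            dbc = d2[tuple(sorted((b, c)))]
            # Lance-Williams update for Ward's method
            d2[(c, new)] = ((na + nc) * dac + (nb + nc) * dbc - nc * dab) / (na + nb + nc)
        members[new] = members.pop(a) + members.pop(b)
        # Drop every distance that still refers to the two merged clusters
        d2 = {pair: v for pair, v in d2.items() if a not in pair and b not in pair}
    return sorted(sorted(m) for m in members.values())


# Hypothetical toy fingerprints (sets of on-bit indices), not real molecules
names = ["mol_A", "mol_B", "mol_C", "mol_D", "mol_E", "mol_F"]
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}, {10, 11, 13}, {20, 21}, {20, 21, 22}]

clusters = ward_clusters(distance_matrix(fps), k=3)
for group in clusters:
    print([names[i] for i in group])
```

Tanimoto distance is not a Euclidean distance, so applying Ward's criterion to it, as this sketch does, is a heuristic grouping rather than an exact variance minimization. In R, the corresponding steps would be carried out with the packages named in the abstract rather than with hand-written helpers like these.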

Updated: 2020-01-20