Optimal HTS Fingerprint Definitions by Using a Desirability Function and a Genetic Algorithm
Journal of Chemical Information and Modeling (IF 5.6), Pub Date: 2018-02-09, DOI: 10.1021/acs.jcim.7b00447
Alvaro Cortes Cabrera, Paula M. Petrone

Compound biological fingerprints built on data from high-throughput screening (HTS) campaigns, or HTS fingerprints, are a novel cheminformatics method of representing compounds by integrating chemical and biological activity data. The method is gaining momentum in drug discovery applications, including hit expansion, target identification, and virtual screening. HTS fingerprints have two major limitations, noise and missing data, which are intrinsic to high-throughput data acquisition technologies and to the assay availability or assay selection procedure used for their construction. In this work, we present a methodology to define an optimal set of HTS fingerprints by using a desirability function that encodes the principles of maximum biological and chemical space coverage and minimum redundancy between HTS assays. We used a genetic algorithm to optimize the desirability function and obtained an optimal fingerprint, which was evaluated for performance on a test set of 33 diverse assays. Our results show that the optimal HTS fingerprint represents compounds in chemical biology space using 25% fewer assays. When used for virtual screening, the optimal HTS fingerprint achieved performance equivalent to that of full fingerprints, in terms of both area under the curve and enrichment factors, for 27 out of 33 test assays, while randomly assembled fingerprints achieved equivalent performance in only 23 test assays.
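
The assay-selection idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the form of the desirability function or the genetic algorithm settings, so everything here is an assumption made for illustration. The names `desirability` and `genetic_search`, the three component terms, the geometric-mean combination (a common way to merge individual desirabilities), and all parameter values are hypothetical.

```python
import numpy as np


def desirability(mask, activity, tested):
    """Toy desirability of an assay subset (hypothetical form, not the paper's).

    mask     : boolean vector over assays, True = assay kept in the fingerprint
    activity : compounds x assays boolean matrix, True = compound is active
    tested   : compounds x assays boolean matrix, True = compound was measured
    """
    if not mask.any():
        return 0.0
    act = activity[:, mask] & tested[:, mask]
    # Chemical coverage: fraction of compounds measured in at least one kept assay.
    chem_cov = tested[:, mask].any(axis=1).mean()
    # Biological coverage: fraction of compounds active in at least one kept assay.
    bio_cov = act.any(axis=1).mean()
    # Redundancy: mean pairwise Jaccard similarity between the assays' active sets.
    a = act.astype(float)
    inter = a.T @ a
    counts = a.sum(axis=0)
    union = counts[:, None] + counts[None, :] - inter
    jac = np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)
    k = jac.shape[0]
    redundancy = (jac.sum() - np.trace(jac)) / (k * (k - 1)) if k > 1 else 0.0
    # Geometric mean of three terms that each lie in [0, 1] (assumed combination).
    return float((chem_cov * bio_cov * (1.0 - redundancy)) ** (1.0 / 3.0))


def genetic_search(activity, tested, n_keep, pop_size=40, generations=60, seed=0):
    """Simple elitist genetic algorithm over assay subsets of fixed size n_keep."""
    rng = np.random.default_rng(seed)
    n_assays = activity.shape[1]

    def random_mask():
        m = np.zeros(n_assays, dtype=bool)
        m[rng.choice(n_assays, size=n_keep, replace=False)] = True
        return m

    def repair(m):
        # Switch random assays on or off until exactly n_keep are selected.
        on, off = np.flatnonzero(m), np.flatnonzero(~m)
        if on.size > n_keep:
            m[rng.choice(on, size=on.size - n_keep, replace=False)] = False
        elif on.size < n_keep:
            m[rng.choice(off, size=n_keep - on.size, replace=False)] = True
        return m

    pop = [random_mask() for _ in range(pop_size)]
    for _ in range(generations):
        scores = np.array([desirability(m, activity, tested) for m in pop])
        order = np.argsort(scores)[::-1]
        # Elitism: the best quarter of the population survives unchanged.
        elite = [pop[i].copy() for i in order[: pop_size // 4]]
        children = []
        while len(elite) + len(children) < pop_size:
            i, j = rng.choice(len(elite), size=2, replace=False)
            # Uniform crossover followed by bit-flip mutation and size repair.
            child = np.where(rng.random(n_assays) < 0.5, elite[i], elite[j])
            flip = rng.random(n_assays) < 1.0 / n_assays
            children.append(repair(child ^ flip))
        pop = elite + children
    scores = np.array([desirability(m, activity, tested) for m in pop])
    best = int(np.argmax(scores))
    return pop[best], float(scores[best])


if __name__ == "__main__":
    # Synthetic data only: 200 compounds, 12 assays, sparse activity, some gaps.
    rng = np.random.default_rng(1)
    tested = rng.random((200, 12)) < 0.8
    activity = (rng.random((200, 12)) < 0.1) & tested
    mask, score = genetic_search(activity, tested, n_keep=9)  # keep 75% of assays
    print("selected assays:", np.flatnonzero(mask), "desirability:", round(score, 3))
```

In this sketch the subset size is fixed in advance, so the search only decides which assays to drop. An approach that also optimizes the number of assays would need a size-dependent term in the desirability function, which is not attempted here.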

Updated: 2018-02-09