Molecular discovery by optimal sequential search
Journal of Mathematical Chemistry (IF 1.7), Pub Date: 2019-08-31, DOI: 10.1007/s10910-019-01062-9
Genyuan Li

In developing a new compound in chemistry and molecular biology, especially a new medicine in the pharmaceutical industry, we often need to find the candidate molecule(s) with the best desired property (e.g., binding affinity in medicine) from a large set of molecules that share the same scaffold but have $m$ distinct functional substituents at each of its $n$ different sites. The total number $N_{\mathrm{lib}}$ of molecules in this library is $m^n$. In some cases, $N_{\mathrm{lib}}$ can be very large (e.g., millions). This is a challenging task because it is costly and often infeasible to synthesize and test all of these molecules. A new algorithm, referred to as optimal sequential search, is developed to overcome this difficulty. In particular, the algorithm is chemically intuitive, uses only information about molecular composition, and is accessible to practicing chemists. It can be applied to small, medium, and large molecule libraries. With syntheses and property measurements for a limited number of molecules, the best candidate molecules can be effectively captured from the whole library. Three examples, with library sizes of 64, 160,000, and 1,048,576, respectively, are used for illustration.
For the small library, syntheses and property measurements of 17 molecules are sufficient to capture the top 7 candidate molecules; for the medium and large libraries, syntheses and property measurements of about one thousand molecules capture most or a large part of the top 500, and especially of the top 100, candidate molecules. However, the algorithm requires many (e.g., hundreds of) iterative rounds of synthesis and property measurement, and the time cost may be unacceptable if it is performed manually. Automating the sequential search process is therefore the next task in making the algorithm practical.
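The abstract does not spell out the steps of optimal sequential search, so the following Python sketch is only a rough illustration of the general idea of searching an $m^n$ library by alternating measurement and selection. It is not the paper's algorithm. The selection rule (averaging measured values per site and substituent), the function names `build_library`, `sequential_search`, and `toy_property`, and the choice $m = 4$, $n = 3$ (one of several factorizations that give 64) are all assumptions made for illustration.

```python
import itertools
import random


def build_library(m, n):
    """Enumerate all m**n molecules; entry i is the substituent index at site i."""
    return list(itertools.product(range(m), repeat=n))


def sequential_search(library, measure, n_initial, budget, seed=0):
    """Greedy measure-then-select loop (illustrative, NOT the paper's method)."""
    rng = random.Random(seed)
    # Start from a small random batch of "synthesized and measured" molecules.
    measured = {mol: measure(mol) for mol in rng.sample(library, n_initial)}
    while len(measured) < budget:
        # Average measured property for each (site, substituent) pair seen so far.
        sums, counts = {}, {}
        for mol, y in measured.items():
            for key in enumerate(mol):
                sums[key] = sums.get(key, 0.0) + y
                counts[key] = counts.get(key, 0) + 1
        overall = sum(measured.values()) / len(measured)

        def predict(mol):
            # Unseen (site, substituent) pairs fall back to the overall mean.
            return sum(sums[k] / counts[k] if k in counts else overall
                       for k in enumerate(mol))

        # Measure the unmeasured molecule with the highest predicted score.
        candidates = [mol for mol in library if mol not in measured]
        best = max(candidates, key=predict)
        measured[best] = measure(best)
    return measured


def toy_property(mol):
    # Hypothetical additive property standing in for a real assay.
    return sum((site + 1) * sub for site, sub in enumerate(mol))


library = build_library(m=4, n=3)  # 4**3 = 64 molecules
results = sequential_search(library, toy_property, n_initial=5, budget=17)
top_found = sorted(results, key=results.get, reverse=True)[:3]
```

Only composition (which substituent sits at which site) enters the selection step, which mirrors the abstract's remark that the algorithm uses only molecular composition. In a real setting `measure` would be a synthesis and assay step, which is why the abstract points to automation as the next task.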

Updated: 2019-08-31