Bridging chemistry and artificial intelligence by a reaction description language
Nature Machine Intelligence (IF 23.9), Pub Date: 2025-05-13, DOI: 10.1038/s42256-025-01032-8
Jiacheng Xiong, Wei Zhang, Yinquan Wang, Jiatao Huang, Yuqi Shi, Mingyan Xu, Manjia Li, Zunyun Fu, Xiangtai Kong, Yitian Wang, Zhaoping Xiong, Mingyue Zheng

With the fast-paced development of artificial intelligence, large language models are increasingly used to tackle various scientific challenges. A critical step in this process is converting domain-specific data into a sequence of tokens for language modelling. In chemistry, molecules are often represented by molecular linear notations, and chemical reactions are depicted as sequence pairs of reactants and products. However, this approach does not capture atomic and bond changes during reactions. Here, we present ReactSeq, a reaction description language that defines molecular editing operations for step-by-step chemical transformation. Based on ReactSeq, language models for retrosynthesis prediction may consistently excel in all benchmark tests, and demonstrate promising emergent abilities in the human-in-the-loop and explainable artificial intelligence. Moreover, ReactSeq has allowed us to obtain universal and reliable representations of chemical reactions, which enable navigation of the reaction space and aid in the recommendation of experimental procedures and prediction of reaction yields. We foresee that ReactSeq can serve as a bridge to narrow the gap between chemistry and artificial intelligence.




Updated: 2025-05-13