SwinOCSR: end-to-end optical chemical structure recognition using a Swin Transformer
Journal of Cheminformatics (IF 7.1), Pub Date: 2022-07-01, DOI: 10.1186/s13321-022-00624-5
Zhanpeng Xu 1, Jianhua Li 1, Zhaopeng Yang 1, Shiliang Li 2, Honglin Li 2

Optical chemical structure recognition from scientific publications is essential for rediscovering a chemical structure. It is an extremely challenging problem, and current rule-based and deep-learning methods cannot achieve satisfactory recognition rates. Herein, we propose SwinOCSR, an end-to-end model based on a Swin Transformer. This model uses the Swin Transformer as the backbone to extract image features and introduces Transformer models to convert chemical information from publications into DeepSMILES. A novel chemical structure dataset was constructed to train and verify our method. Our proposed Swin Transformer-based model was extensively tested against the backbone of existing publicly available deep learning methods. The experimental results show that our model significantly outperforms the compared methods, demonstrating the model’s effectiveness. Moreover, we used a focal loss to address the token imbalance problem in the text representation of the chemical structure diagram, and our model achieved an accuracy of 98.58%.
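
The focal loss mentioned in the abstract can be illustrated with a minimal sketch. This is the generic formulation, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), applied per output token; it is not the authors' implementation. The function names, the default `gamma=2.0` and `alpha=1.0`, and the toy logits are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Plain cross-entropy for one token: -log(p_t)."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Generic focal loss for one token (gamma and alpha are assumed defaults).

    The factor (1 - p_t) ** gamma shrinks the loss of tokens the model
    already predicts confidently, so rare or hard tokens dominate the sum.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def sequence_focal_loss(step_logits, targets, gamma=2.0):
    """Mean focal loss over the decoding steps of one token sequence."""
    losses = [focal_loss(l, t, gamma) for l, t in zip(step_logits, targets)]
    return sum(losses) / len(losses)


# Toy example: a confidently predicted (frequent) token vs. an uncertain (rare) one.
easy = [5.0, 0.0, 0.0]   # model strongly favours class 0
hard = [0.0, 0.0, 0.0]   # model is undecided between three classes
print(cross_entropy(easy, 0), focal_loss(easy, 0))
print(cross_entropy(hard, 0), focal_loss(hard, 0))
```

With `gamma=0` the focal loss reduces to ordinary cross-entropy. Larger `gamma` values down-weight well-classified tokens more strongly, which is why this loss is commonly used when a few tokens dominate the vocabulary.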

Updated: 2022-07-01