Rethinking the Masking Strategy for Pretraining Molecular Graphs from a Data-Centric View
ACS Omega (IF 4.1), Pub Date: 2024-05-03, DOI: 10.1021/acsomega.3c09512
Wei Lin, Chi Chung Alan Fung

Node-level self-supervised learning has been widely applied to pretraining molecular graphs. Attribute Masking (AttrMask) is the pioneering work in this field, and methods that improve on it focus on enhancing the capacity of the backbone model by adding extra modules. However, because these methods select atoms to mask with a purely random masking strategy, they overlook the imbalanced distribution of atom types in molecules. Motivated by the properties of molecules, we propose a weighted masking strategy that makes more effective use of molecular information during pretraining. Our experimental results demonstrate that AttrMask combined with the proposed weighted masking strategy outperforms the random masking strategy and even surpasses the model-centric improvement methods without increasing the parameter count. Moreover, the weighted masking strategy can be extended to other pretraining methods for further performance gains.
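
The abstract does not spell out how the masking weights are computed. As a rough illustration only, the Python sketch below samples atoms to mask with probability inversely proportional to a corpus-wide atom-type frequency, so rare elements are masked more often than the dominant carbon atoms; the function name, the inverse-frequency weighting rule, and the frequency values are illustrative assumptions, not the authors' implementation.

import numpy as np

def weighted_mask_indices(atom_types, freq, mask_ratio=0.15, rng=None):
    """Choose which atoms to mask for AttrMask-style pretraining.

    Instead of uniform random masking, nodes are sampled with weights
    inverse to their atom type's corpus frequency (an assumed scheme).

    atom_types : 1-D int array of atomic numbers for the graph's nodes
    freq       : dict mapping atomic number -> corpus-wide frequency
    """
    rng = rng or np.random.default_rng()
    # Under-represented elements (e.g. S, P) receive larger weights
    # than carbon, so the model sees them masked more often.
    weights = np.array([1.0 / freq[a] for a in atom_types])
    probs = weights / weights.sum()
    n_mask = max(1, int(round(mask_ratio * len(atom_types))))
    return rng.choice(len(atom_types), size=n_mask, replace=False, p=probs)

# Toy example: a molecule that is mostly carbon (6) with one nitrogen (7)
# and one sulfur (16); the corpus frequencies here are made up.
corpus_freq = {6: 0.72, 7: 0.12, 8: 0.10, 16: 0.04}
atoms = np.array([6, 6, 6, 6, 6, 7, 6, 16])
masked = weighted_mask_indices(atoms, corpus_freq, mask_ratio=0.25)
print("masked node indices:", masked)

Under this assumed weighting, the nitrogen and sulfur nodes are far more likely to be selected than any individual carbon, which is the data-centric effect the abstract describes: counteracting the imbalanced atom distribution without changing the backbone model or adding parameters.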

Updated: 2024-05-03