NNI-SMOTE-XGBoost: A Novel Small Sample Analysis Method for Properties Prediction of Polymer Materials
Macromolecular Theory and Simulations (IF 1.8), Pub Date: 2021-05-02, DOI: 10.1002/mats.202100010
Dazi Li 1, Jianxun Liu 1, Jun Liu 2

Although machine learning has accelerated property prediction for polymer materials, obtaining enough samples for accurate and fast prediction remains a challenge because the experimental process is complex and lengthy. This work presents a prediction model for small-sample analysis that combines an ensemble learning algorithm, extreme gradient boosting (XGBoost), with nearest neighbor interpolation (NNI) and the synthetic minority oversampling technique (SMOTE). Rather than applying a small-sample prediction algorithm directly, a new idea based on feature engineering is proposed. The NNI algorithm interpolates the original data set to address data insufficiency, and the SMOTE algorithm addresses data imbalance by adding minority samples. The expanded data set is then used to build an XGBoost prediction model. The proposed method is used to establish a model that predicts the Akron abrasion of rubber from its mechanical properties. The original data set is expanded to 710 samples through two different interpolations. Experimental results show better prediction accuracy and generalization ability than traditional algorithms. Akron abrasion is found to be most strongly related to the elongation at break of the polymer materials.
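
The oversampling step can be illustrated with a minimal sketch of the generic SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbors. This is not the authors' implementation. The function name `smote_oversample`, the parameters `k` and `seed`, and the toy feature values are all assumptions made for illustration, and the abstract does not say how minority samples are defined for the abrasion data. The NNI step and the XGBoost model are not shown.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points built from minority feature vectors.

    Hypothetical helper illustrating generic SMOTE interpolation only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Made-up toy data: two features per sample (values are not from the paper).
minority = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5], [1.1, 10.4]]
new_points = smote_oversample(minority, n_new=6)
```

Because every synthetic point is a convex combination of two existing samples, the new data stays inside the range already covered by the minority samples. The expanded set would then be passed to a gradient-boosting model for training.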

Updated: 2021-05-02