Valuing free-form text data from maintenance logs through transfer learning with CamemBERT
Enterprise Information Systems ( IF 4.4 ) Pub Date : 2020-07-07 , DOI: 10.1080/17517575.2020.1790043
Juan Pablo Usuga Cadavid 1, 2 , Bernard Grabot 3 , Samir Lamouri 1 , Robert Pellerin 4 , Arnaud Fortin 2

ABSTRACT

Coupling a production scheduling process with maintenance logs can provide important advantages. For instance, it allows planning to be adapted to the reality of the shop floor. Nevertheless, maintenance logs are often highly unstructured, as they mainly rely on free-form text comments from operators, and imbalanced, as commonplace issues happen more often than critical problems. This hinders the application of machine learning methods to exploit these data. Thus, this study explores the use of a recent model named CamemBERT to tackle these difficulties through transfer learning. More specifically, the purpose is to predict the criticality and duration of a maintenance issue from the description provided. Findings suggest that fine-tuning CamemBERT outperforms other classical and feature-based approaches. Furthermore, the class imbalance problem is addressed from both a data pre-processing and a training perspective: first, k-means with silhouette diagrams allowed the creation of more homogeneous classes, and second, the use of resampling improved the model’s performance.
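The abstract does not say which resampling scheme was used, so the following is only a minimal sketch of one common option, random oversampling of the minority class, using the Python standard library. The function name `random_oversample`, the toy French comments, and the labels `minor` and `critical` are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class examples until every
    class has as many examples as the largest class.

    Assumption: this is plain random oversampling, not necessarily the
    resampling method used in the study.
    """
    rng = random.Random(seed)

    # Group the comments by their class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is topped up to the size of the largest one.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy maintenance comments (in French, since CamemBERT is a
# French language model); three commonplace issues and one critical one.
texts = ["fuite d'huile", "capteur bloqué", "nettoyage", "arrêt total ligne"]
labels = ["minor", "minor", "minor", "critical"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

In practice, resampling of this kind is normally applied to the training split only, so that the validation and test sets keep the original class distribution.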




Updated: 2020-07-07