Technological troubleshooting based on sentence embedding with deep transformers
Journal of Intelligent Manufacturing (IF 8.3), Pub Date: 2021-06-07, DOI: 10.1007/s10845-021-01797-w
Antonio L. Alfeo, Mario G. C. A. Cimino, Gigliola Vaglini

In today's manufacturing, each technical assistance operation is digitally tracked. This results in a huge amount of textual data that can be exploited as a knowledge base to improve these operations. For instance, an ongoing problem can be addressed by retrieving potential solutions from those used to cope with similar problems during past operations. To be effective, most approaches to semantic textual similarity need to be supported by a structured semantic context (e.g. an industry-specific ontology), resulting in high development and management costs. We overcome this limitation with a textual similarity approach featuring three functional modules. The data preparation module performs punctuation and stop-word removal, and word lemmatization. The pre-processed sentences are passed to the sentence embedding module, which is based on Sentence-BERT (Bidirectional Encoder Representations from Transformers) and transforms the sentences into fixed-length vectors. Their cosine similarity is processed by the scoring module to match the expected similarity between the two original sentences. Finally, this similarity measure is employed to retrieve the most suitable recorded solutions for the ongoing problem. The effectiveness of the proposed approach is tested (i) against a state-of-the-art competitor and two well-known textual similarity approaches, and (ii) with two case studies, i.e. technical assistance reports from a private company and a benchmark dataset for semantic textual similarity. Compared with the state of the art, the proposed approach achieves comparable retrieval performance at a significantly lower management cost: 30-min questionnaires are sufficient to obtain the semantic context knowledge to be injected into our textual search engine.
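
The retrieval flow described in the abstract (prepare the text, embed each sentence as a fixed-length vector, compare by cosine similarity, return the best-matching recorded solution) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the bag-of-words `embed` function is a dependency-free stand-in for Sentence-BERT, lemmatization and the scoring module are omitted, and the stop-word list, function names, and sample records are all hypothetical.

```python
import math
import string
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption, not from the paper).
STOP_WORDS = {"the", "a", "an", "is", "of", "to", "and", "in", "on"}


def prepare(sentence):
    """Data preparation: lowercase, strip punctuation, drop stop words.

    Lemmatization, which the abstract also mentions, is left out here
    to keep the sketch free of external dependencies.
    """
    cleaned = sentence.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def embed(tokens, vocabulary):
    """Stand-in for the sentence embedding module.

    Returns a fixed-length count vector over the vocabulary. The approach
    in the abstract uses Sentence-BERT embeddings instead.
    """
    counts = Counter(tokens)
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(problem, records, top_k=1):
    """Rank recorded (past_problem, solution) pairs by similarity to the ongoing problem."""
    prepared = [prepare(past_problem) for past_problem, _ in records]
    query = prepare(problem)
    vocabulary = sorted({tok for toks in prepared + [query] for tok in toks})
    query_vec = embed(query, vocabulary)
    scored = [
        (cosine_similarity(query_vec, embed(toks, vocabulary)), solution)
        for toks, (_, solution) in zip(prepared, records)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical technical assistance records, for illustration only.
    records = [
        ("Pump motor overheats during startup", "Replace cooling fan"),
        ("Display shows error code after reboot", "Update firmware"),
    ]
    print(retrieve("Motor overheats at startup", records))
```

Swapping `embed` for a transformer-based sentence encoder would leave the rest of the flow unchanged, since the retrieval step only relies on fixed-length vectors and their cosine similarity.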




Updated: 2021-06-07