Fault Diagnosis of Signal Equipment on the Lanzhou-Xinjiang High-Speed Railway Using Machine Learning for Natural Language Processing
Complexity (IF 2.3), Pub Date: 2021-07-28, DOI: 10.1155/2021/9126745
Lei Shi 1, Yulin Zhu 1, Youpeng Zhang 1,2, Zhongji Su 1

The Lanzhou-Xinjiang (Lan-Xin) high-speed railway is one of the principal sections of the railway network in western China, and its signal equipment is essential to the safe and efficient operation of the line. Over many years of operation and maintenance, the railway signaling and communications department has recorded a large amount of unstructured natural-language text describing equipment faults. Because these records were not written in a consistent format, they are difficult to use directly. In this paper, a method based on natural language processing (NLP) was adopted to analyze and classify this information. First, the Latent Dirichlet Allocation (LDA) topic model was used to extract the semantic features of the text, so that each record was represented in the corresponding topic feature space. Next, the Support Vector Machine (SVM) algorithm was used to construct a signal equipment fault diagnosis model that reduced the impact of sample data imbalance on classification accuracy. This model was compared with the traditional Naive Bayes (NB), Logistic Regression (LR), Random Forest (RF), and K-Nearest Neighbor (KNN) algorithms. Signal equipment failure text data from the Lan-Xin high-speed railway were used to conduct the experiments and verify the effectiveness of the proposed method. The experiments showed that the accuracy of the SVM classification algorithm could reach 0.84 when combined with the LDA topic model, which indicates that the natural language processing method can effectively diagnose signal equipment faults and can help guide the maintenance of signal equipment in the field.
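
The two-stage approach described in the abstract (LDA topic features fed to an SVM classifier) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the fault records and labels are invented, the number of topics and the RBF kernel are arbitrary choices, and `class_weight="balanced"` is only one possible way to counter class imbalance, assumed here because the abstract does not say how imbalance was handled. Real Chinese-language records would also need word segmentation before vectorization.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical, already-tokenized fault records and labels, invented
# purely for illustration; they are not data from the paper.
records = [
    "track circuit red band after rain",
    "track circuit voltage drop relay",
    "switch machine failed to lock",
    "switch machine motor current high",
    "signal lamp filament broken",
    "signal lamp dark main filament",
    "track circuit insulation joint damaged",
    "switch machine gap out of tolerance",
]
labels = [
    "track_circuit", "track_circuit", "switch", "switch",
    "signal", "signal", "track_circuit", "switch",
]


def build_model(n_topics=3, seed=0):
    """Bag-of-words counts -> LDA topic proportions -> SVM classifier."""
    return make_pipeline(
        CountVectorizer(),
        LatentDirichletAllocation(n_components=n_topics, random_state=seed),
        # Assumption: weight classes inversely to their frequency so that
        # rare fault types are not swamped by common ones.
        SVC(kernel="rbf", class_weight="balanced"),
    )


model = build_model()
model.fit(records, labels)

# Each record expressed in the topic feature space (one column per topic).
topic_features = model[:-1].transform(records)

# Predicted fault category for each record.
predictions = model.predict(records)
```

On a toy set this small the predictions carry no meaning; the point is only the shape of the pipeline. The same `build_model` structure would let the SVM step be swapped for NB, LR, RF, or KNN classifiers when reproducing a comparison like the one the abstract mentions.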

Updated: 2021-07-28