Exploring Structural Knowledge for Automated Visual Inspection of Moving Trains
IEEE Transactions on Cybernetics (IF 9.4), Pub Date: 2020-06-19, DOI: 10.1109/tcyb.2020.2998126
Cen Chen 1 , Xiaofeng Zou 1 , Zeng Zeng 2 , Zhongyao Cheng 2 , Le Zhang 2 , Steven C. H. Hoi 3
Deep learning methods are becoming the de facto standard for generic visual recognition. However, their adaptation to industrial scenarios, such as visual recognition for machines and production lines, which consist of countless components, has not yet been well investigated. Compared with generic object detection, these scenarios contain strong structural knowledge (e.g., fixed relative positions of components and component relationships). A case worth exploring is automated visual inspection for trains, which involve many correlated components. However, the dominant object detection paradigm is limited in that it treats the visual features of each object region separately, without considering common-sense knowledge shared among objects. In this article, we propose a novel automated visual inspection framework for trains that explores structural knowledge for train component detection, called SKTCD. SKTCD is an end-to-end trainable framework in which the visual features of train components and structural knowledge (including hierarchical scene contexts and spatial-aware component relationships) are jointly exploited for train component detection. We propose novel residual multiple gated recurrent units (Res-MGRUs) that optimally fuse the visual features of train components and messages from the structural knowledge in a weighted-recurrent way. To verify the feasibility of SKTCD, we collected a dataset of high-resolution images captured from moving trains, in which 18,590 critical train components are manually annotated. Extensive experiments on this dataset and on the PASCAL VOC dataset demonstrate that SKTCD significantly outperforms existing challenging baselines. The dataset and the source code can be downloaded online (https://github.com/smartprobe/SKCD).
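The abstract does not give the Res-MGRU equations, so the following is only a minimal pure-Python sketch of the general idea it describes: a gated, residual fusion of a component's visual feature vector with a message derived from structural knowledge. All names here (`res_mgru_step`, `W_z`, `W_c`) are invented for illustration and are not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def res_mgru_step(h, m, W_z, W_c):
    """One gated, residual fusion step in the spirit of a Res-MGRU cell.

    h   -- visual feature vector of a component (length d)
    m   -- aggregated message from structural knowledge (length d)
    W_z -- update-gate weights: d rows of length 2d (hypothetical name)
    W_c -- candidate weights: d rows of length 2d (hypothetical name)
    """
    x = h + m  # list concatenation, i.e. the stacked input [h; m]
    z = [sigmoid(v) for v in matvec(W_z, x)]    # update gate in (0, 1)
    c = [math.tanh(v) for v in matvec(W_c, x)]  # candidate update
    # Residual connection: the original visual feature is preserved and the
    # knowledge-derived update is added through the learned gate.
    return [hi + zi * ci for hi, zi, ci in zip(h, z, c)]
```

With all-zero candidate weights the candidate update vanishes and the step reduces to the identity on `h`, which illustrates how the residual path keeps the original visual feature intact regardless of the incoming message.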

Updated: 2024-08-22