Extracting full-field subpixel structural displacements from videos via deep learning
Journal of Sound and Vibration (IF 4.3) Pub Date: 2021-04-19, DOI: 10.1016/j.jsv.2021.116142
Lele Luan, Jingwei Zheng, Ming L. Wang, Yongchao Yang, Piervincenzo Rizzo, Hao Sun

Conventional displacement sensing techniques (e.g., laser, linear variable differential transformer) have been widely used in structural health monitoring over the past two decades. Although these techniques can measure displacement time histories with high accuracy, distinct shortcomings remain, such as point-to-point contact sensing, which limits their applicability to real-world problems. Video cameras have become widely used in recent years owing to their low cost, agility, high spatial sensing resolution, and non-contact operation. Compared with target-tracking approaches (e.g., digital image correlation, template matching), the phase-based method is powerful for detecting small subpixel motions without requiring paints or markers on the structural surface. Nevertheless, its complex computational procedure limits real-time inference. To address this fundamental issue, we develop a deep learning framework based on convolutional neural networks (CNNs) that enables real-time extraction of full-field subpixel structural displacements from videos. In particular, two new CNN architectures are designed and trained on a dataset generated by the phase-based motion extraction method from a single lab-recorded high-speed video of a dynamic structure. Since displacement is reliable only in regions with sufficient texture contrast, the sparsity of the motion field induced by the texture mask is accounted for in both the network architecture design and the loss function definition. Results show that, under the supervision of full and sparse motion fields, the trained networks can identify pixels with sufficient texture contrast as well as their subpixel motions. The performance of the trained networks is further tested on videos of other structures to extract full-field motion (e.g., displacement time histories), indicating that they generalize well and accurately extract full-field subpixel displacements for pixels with sufficient texture contrast.
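The phase-based extraction step that generates the training data can be sketched in one dimension: filtering with a complex Gabor kernel gives each position a local phase, and the frame-to-frame phase difference divided by the spatial phase gradient yields a subpixel displacement estimate. The NumPy toy below illustrates only this principle, not the authors' implementation; the filter parameters and the amplitude threshold (standing in for the paper's texture mask) are illustrative assumptions.

```python
import numpy as np

def complex_gabor(ksize, omega, sigma):
    """Complex Gabor kernel: Gaussian window times exp(i*omega*x)."""
    x = np.arange(ksize) - ksize // 2
    return np.exp(-x**2 / (2.0 * sigma**2)) * np.exp(1j * omega * x)

def phase_based_shift(f0, f1, omega=0.4, sigma=6.0, ksize=49):
    """Estimate the global subpixel shift between two 1-D signals.

    For a shift d, the local phase changes by d times the spatial phase
    gradient, so d is recovered from the temporal phase difference
    divided by that gradient.
    """
    k = complex_gabor(ksize, omega, sigma)
    r0 = np.convolve(f0, k, mode="same")           # complex responses
    r1 = np.convolve(f1, k, mode="same")
    dphi_t = np.angle(r0 * np.conj(r1))            # temporal phase change
    dphi_x = np.gradient(np.unwrap(np.angle(r0)))  # spatial phase gradient
    amp = np.abs(r0) * np.abs(r1)
    # Amplitude plays the role of a texture mask: phase is trusted only
    # where the filter response (texture contrast) is strong.
    mask = amp > 0.2 * amp.max()
    w = amp[mask]
    return np.sum(w * dphi_t[mask] / dphi_x[mask]) / np.sum(w)

# Toy check: a textured 1-D signal shifted by 0.3 px in the Fourier domain.
rng = np.random.default_rng(0)
f0 = np.convolve(rng.standard_normal(512), np.hanning(9), mode="same")
freqs = np.fft.fftfreq(512) * 2.0 * np.pi
f1 = np.fft.ifft(np.fft.fft(f0) * np.exp(-1j * freqs * 0.3)).real
print(phase_based_shift(f0, f1))  # prints a value close to 0.3
```

The amplitude weighting also shows why the extracted motion field is sparse: where local texture contrast is weak, the filter response and hence the phase carry no reliable displacement information.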

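The abstract's idea of encoding motion-field sparsity in the loss can likewise be made concrete. The PyTorch snippet below is a hypothetical sketch, not the paper's actual loss: it combines a regression term restricted to pixels flagged by the texture mask with a penalty that pulls predictions toward zero on untextured pixels. The function name, tensor shapes, and the weight alpha are assumptions.

```python
import torch

def masked_displacement_loss(pred, target, texture_mask, alpha=1.0):
    """Hypothetical texture-masked loss for full-field displacement CNNs.

    pred, target : (B, 2, H, W) horizontal/vertical displacement fields.
    texture_mask : (B, 1, H, W), equal to 1 where texture contrast suffices.
    """
    mask = texture_mask.expand_as(pred)
    # Supervise subpixel motion only on pixels the texture mask marks as
    # reliable.
    motion = ((pred - target) ** 2 * mask).sum() / mask.sum().clamp(min=1.0)
    # Encourage a sparse field by driving predictions to zero elsewhere.
    empty = 1.0 - mask
    sparsity = ((pred ** 2) * empty).sum() / empty.sum().clamp(min=1.0)
    return motion + alpha * sparsity

# Shape and gradient check with random tensors.
pred = torch.randn(4, 2, 64, 64, requires_grad=True)
target = torch.randn(4, 2, 64, 64)
mask = (torch.rand(4, 1, 64, 64) > 0.7).float()
masked_displacement_loss(pred, target, mask).backward()
```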



Updated: 2021-05-03