Pose estimation at night in infrared images using a lightweight multi-stage attention network

Signal, Image and Video Processing (IF 2.0), Pub Date: 2021-05-03, DOI: 10.1007/s11760-021-01916-3
Ying Zang, Chunpeng Fan, Zeyu Zheng, Dongsheng Yang

Human keypoint detection is a fundamental task in computer vision and a prerequisite for human action recognition, behavior analysis and human–computer interaction. Since most abnormal actions occur at night, effectively extracting skeleton sequence data in low-light or completely dark environments is a major challenge for recognizing them. This paper proposes using far-infrared images to detect the key points of the human body, which makes human pose estimation possible under challenging conditions such as total darkness, smoke, inclement weather and glare. However, far-infrared images have shortcomings such as low resolution, noise and thermal characteristics, and the skeleton data must be provided in real time for the next stage of the task. For these reasons, this paper proposes a lightweight multi-stage attention network (LMANet) to detect human key points at night. The network adds context information through a large receptive field, which assists the detection of neighboring key points; to keep the network lightweight, it is extended to only two stages. In addition, an attention module selects the channels that carry the most information and highlights the features of key points while eliminating background interference. To detect human key points in a variety of complex environments, we use techniques such as hard sample mining, which improves the accuracy of key points with low confidence. The network has been verified on two visible-light datasets, demonstrating excellent performance. This paper also introduces far-infrared images into the field of pose estimation. Because there is no public dataset for far-infrared pose estimation, 700 images were selected for annotation from multiple public far-infrared object detection, segmentation and action recognition datasets; our algorithm was verified on this dataset with good results. After the paper is published, we will release our human key-point annotation files.
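The abstract does not say how the hard sample mining is implemented. One common form in heatmap-based pose estimation is hard keypoint mining, where only the keypoints with the largest loss contribute to the training signal. The sketch below illustrates that idea under stated assumptions and is not the authors' code: the names `per_keypoint_mse`, `hard_keypoint_loss` and `top_k` are hypothetical, and heatmaps are plain nested lists so that the example has no dependencies.

```python
from typing import List

# One 2-D heatmap (rows of floats) per keypoint. Hypothetical layout for illustration.
Heatmap = List[List[float]]


def per_keypoint_mse(pred: List[Heatmap], target: List[Heatmap]) -> List[float]:
    """Return the mean squared error of each keypoint's heatmap."""
    losses = []
    for p_map, t_map in zip(pred, target):
        sq_err = [
            (p - t) ** 2
            for p_row, t_row in zip(p_map, t_map)
            for p, t in zip(p_row, t_row)
        ]
        losses.append(sum(sq_err) / len(sq_err))
    return losses


def hard_keypoint_loss(pred: List[Heatmap], target: List[Heatmap], top_k: int) -> float:
    """Average the loss over only the top_k keypoints with the largest error.

    Assumption: "hard" keypoints are the ones with the highest per-keypoint
    loss; the paper may define or weight them differently.
    """
    if top_k <= 0:
        raise ValueError("top_k must be positive")
    losses = per_keypoint_mse(pred, target)
    hardest = sorted(losses, reverse=True)[:top_k]  # keep the largest losses
    return sum(hardest) / len(hardest)


# Toy example: three keypoints, each with a 1x2 heatmap.
pred = [[[0.0, 0.0]], [[1.0, 1.0]], [[2.0, 0.0]]]
target = [[[0.0, 0.0]], [[0.0, 0.0]], [[0.0, 0.0]]]
loss = hard_keypoint_loss(pred, target, top_k=2)
```

In this toy example the first keypoint is already predicted perfectly, so it is excluded and the loss is driven by the two worse keypoints. A real training loop would compute the same quantity on batched tensors in a deep learning framework.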

Updated: 2021-05-03
