Video-based person re-identification by intra-frame and inter-frame graph neural network
Image and Vision Computing ( IF 4.2 ) Pub Date : 2020-11-28 , DOI: 10.1016/j.imavis.2020.104068
Guiqing Liu , Jinzhao Wu

In the past few years, video-based person re-identification (Re-ID) has attracted growing research attention. The crucial problem in this task is how to learn a robust video feature representation that can weaken the influence of factors such as occlusion, illumination, and background. Many previous works utilize spatio-temporal information to represent a pedestrian video, but the correlations between parts of the human body are ignored. To take advantage of the relationships among different parts, we propose a novel Intra-frame and Inter-frame Graph Neural Network (I2GNN) for the video-based person Re-ID task. Specifically, (1) the part features of each frame are treated as graph nodes; (2) intra-frame edges are established from the correlations between different parts; (3) inter-frame edges are constructed between the same parts across adjacent frames. I2GNN learns video representations by applying graph convolution to the input features with the adjacency matrix of the graph, and then adopts projection metric learning on the Grassmann manifold to measure the similarities between the learned pedestrian features. Moreover, this paper proposes a novel occlusion-invariant term that pulls the part features close to their center, which can relieve several uncontrolled complicating factors, such as occlusion and pose variation. We have carried out extensive experiments on four widely used datasets: MARS, DukeMTMC-VideoReID, PRID2011, and iLIDS-VID. The experimental results demonstrate that our proposed I2GNN model is more competitive than other state-of-the-art methods.
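The abstract's three construction steps (part features as nodes, correlation-weighted intra-frame edges, same-part inter-frame edges) and the adjacency-based graph convolution can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the tensor sizes, the cosine-similarity edge weighting, the symmetrically normalized convolution, and the `projection_metric` helper are all assumptions chosen to make the idea concrete.

```python
import numpy as np

T, P, D = 4, 6, 128          # frames, body parts per frame, feature dim (assumed)
rng = np.random.default_rng(0)
X = rng.standard_normal((T * P, D))   # node features; node id = t * P + p

A = np.zeros((T * P, T * P))

# (2) Intra-frame edges: weight by correlation (here, clipped cosine
# similarity) between different parts of the same frame.
for t in range(T):
    for p in range(P):
        for q in range(P):
            if p != q:
                i, j = t * P + p, t * P + q
                xi, xj = X[i], X[j]
                cos = xi @ xj / (np.linalg.norm(xi) * np.linalg.norm(xj))
                A[i, j] = max(0.0, cos)

# (3) Inter-frame edges: connect the same part across adjacent frames.
for t in range(T - 1):
    for p in range(P):
        i, j = t * P + p, (t + 1) * P + p
        A[i, j] = A[j, i] = 1.0

# One graph-convolution layer using the adjacency matrix:
# H = ReLU(D^{-1/2} (A + I) D^{-1/2} X W), a common normalization choice.
A_hat = A + np.eye(T * P)                      # add self-loops
d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
W = rng.standard_normal((D, 64)) * 0.1         # learnable weights (random here)
H = np.maximum(0.0, d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W)

# Projection metric on the Grassmann manifold (standard definition):
# d(Y1, Y2) = ||Y1 Y1^T - Y2 Y2^T||_F / sqrt(2), for orthonormal Y1, Y2.
def projection_metric(Y1, Y2):
    return np.linalg.norm(Y1 @ Y1.T - Y2 @ Y2.T) / np.sqrt(2)

Q1, _ = np.linalg.qr(rng.standard_normal((64, 5)))
Q2, _ = np.linalg.qr(rng.standard_normal((64, 5)))
dist = projection_metric(Q1, Q2)
print(H.shape)   # (24, 64)
```

The projection metric at the end shows how similarities between two learned representations (as subspaces spanned by orthonormal bases) could be measured; the paper's actual metric-learning objective on the Grassmann manifold is more involved than this single distance.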




Updated: 2020-12-17