A Lightweight Neural Learning Algorithm for Real-Time Facial Feature Tracking System via Split-Attention and Heterogeneous Convolution
Neural Processing Letters (IF 2.6), Pub Date: 2022-08-13, DOI: 10.1007/s11063-022-10951-1
Yuandong Ma, Qing Song, Mengjie Hu, Xiaotong Zhu

Object tracking has made remarkable progress in the past few years. However, most advanced trackers are becoming computationally more expensive, which limits their deployment on mobile devices with limited resources. In addition, currently popular trackers realize similarity learning through feature correlation between multiple branches. Some of these cross-correlation methods lose much of the face information, while others introduce a large amount of unfavorable background information. Motivated by these problems, this paper aims to reduce the number of model parameters and to strengthen feature extraction. Heterogeneous convolution is introduced into the backbone network to reduce the convolution kernel parameters. A search box mechanism is added to dynamically adjust the network's receptive field and to generate more feature maps with cheap operations. Furthermore, we integrate the split-attention mechanism into the backbone network to standardize the arrangement of the heterogeneous convolutions. To evaluate the model, we conducted experiments on the challenging VTB datasets and on datasets captured in real scenes, which contain 82,351 facial features. Experimental results show that the distance precision (DP) and overlap success precision (OP) of our method are 93.5% and 67.5%, respectively, which is comparable with state-of-the-art object tracking methods while reducing the number of parameters by about one-third. In addition, the feature maps of each convolution module are explored, and an interpretation of lightweight convolution is given.
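The abstract does not spell out how heterogeneous convolution lowers the kernel parameter count, so the following is only a minimal sketch of the arithmetic under one common formulation, assumed here rather than taken from the paper: in each filter, a fraction 1/P of the input channels uses k×k kernels and the remaining channels use 1×1 kernels. The function names and the `part` parameter are hypothetical, and bias terms are ignored.

```python
def standard_conv_params(in_ch: int, out_ch: int, k: int = 3) -> int:
    """Weights in a standard convolution: every filter has a k x k kernel per input channel."""
    return out_ch * in_ch * k * k


def hetero_conv_params(in_ch: int, out_ch: int, k: int = 3, part: int = 4) -> int:
    """Weights in a heterogeneous convolution (assumed formulation).

    In each filter, in_ch / part input channels get a k x k kernel and the
    remaining channels get a 1 x 1 kernel. `part` is a hypothetical name
    for the split factor P.
    """
    if part < 1 or in_ch % part != 0:
        raise ValueError("part must be a positive divisor of in_ch")
    large = in_ch // part      # channels that keep the k x k kernel
    small = in_ch - large      # channels reduced to a 1 x 1 kernel
    return out_ch * (large * k * k + small)


if __name__ == "__main__":
    # Illustrative layer sizes only; they are not taken from the paper.
    std = standard_conv_params(64, 64, k=3)
    het = hetero_conv_params(64, 64, k=3, part=2)
    print(f"standard: {std} weights, heterogeneous (P=2): {het} weights")
```

With `part=1` the count is identical to the standard convolution, and larger values of `part` replace more k×k kernels with 1×1 kernels. The layer sizes above are illustrative and say nothing about the overall reduction reported for the full tracker.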




Updated: 2022-08-13