An improved CNN framework for detecting and tracking human body in unconstraint environment
Knowledge-Based Systems (IF 8.8), Pub Date: 2019-11-11, DOI: 10.1016/j.knosys.2019.105198
N. Kumar, N. Sukavanam

Human tracking and localization play a crucial role in many applications such as accident avoidance, action recognition, safety and security, surveillance, and crowd analysis. Inspired by its use and scope, we introduce a novel method for tracking and re-localizing humans (one or many) in a complex environment with large displacement. The model can handle complex backgrounds, variations in illumination, changes in target pose, the presence of targets with similar appearance (pose and clothes), motion of both the target and the camera, occlusion of the target, background variation, and massive displacement of the target. Our model uses three convolutional neural network based deep architectures and cascades their learning so as to improve the overall efficiency of the model. The first network learns a pixel-level representation of small regions. The second architecture uses these features and learns the displacement of a region together with its category among the moved, not-moved, and occluded classes. The third network improves the displacement result of the second network by utilizing the previous two learnings. We also create a semi-synthetic dataset for training purposes. The model is trained on this dataset first and then tested on subsets of the CamNeT, VOT2015, LITIV-tracking, and Visual Tracker Benchmark databases without training on real data. The proposed model yields results comparable to current state-of-the-art methods based on the evaluation criteria described in the Object Tracking Benchmark, TPAMI 2015, CVPR 2013, and ICCV 2017.
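The abstract only outlines the three-network cascade at a high level. Below is a minimal, hypothetical PyTorch sketch of one plausible reading of that pipeline; it is not the authors' implementation, and the module names, patch size (32x32), feature width (128), and layer counts are all assumptions made for illustration.

# Hypothetical sketch of the three-stage cascade described in the abstract.
# Module names, patch size (32x32), and feature width (128) are assumptions,
# not taken from the paper.
import torch
import torch.nn as nn


class RegionFeatureNet(nn.Module):
    """Stage 1: pixel-level representation of a small image region (patch)."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                        # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                        # 16x16 -> 8x8
            nn.Conv2d(64, feat_dim, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                # global pooling -> feat_dim vector
            nn.Flatten(),
        )

    def forward(self, patch: torch.Tensor) -> torch.Tensor:
        return self.encoder(patch)                  # (B, feat_dim)


class DisplacementNet(nn.Module):
    """Stage 2: from the features of a region in two frames, predict its
    displacement and classify it as moved / not-moved / occluded."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
        )
        self.displacement = nn.Linear(128, 2)       # (dx, dy)
        self.category = nn.Linear(128, 3)           # moved / not-moved / occluded logits

    def forward(self, feat_prev, feat_curr):
        h = self.trunk(torch.cat([feat_prev, feat_curr], dim=1))
        return self.displacement(h), self.category(h)


class RefinementNet(nn.Module):
    """Stage 3: refine the stage-2 displacement using both earlier outputs."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Linear(2 * feat_dim + 2 + 3, 128), nn.ReLU(),
            nn.Linear(128, 2),                      # residual correction to (dx, dy)
        )

    def forward(self, feat_prev, feat_curr, disp, cat_logits):
        x = torch.cat([feat_prev, feat_curr, disp, cat_logits], dim=1)
        return disp + self.refine(x)                # refined displacement


if __name__ == "__main__":
    stage1, stage2, stage3 = RegionFeatureNet(), DisplacementNet(), RefinementNet()
    prev_patch = torch.randn(4, 3, 32, 32)          # batch of region patches, frame t-1
    curr_patch = torch.randn(4, 3, 32, 32)          # same regions, frame t
    f_prev, f_curr = stage1(prev_patch), stage1(curr_patch)
    disp, cat_logits = stage2(f_prev, f_curr)
    refined = stage3(f_prev, f_curr, disp, cat_logits)
    print(refined.shape, cat_logits.argmax(dim=1))  # (4, 2), per-region class ids

In this reading, stage 2 treats tracking of each small region as joint displacement regression and three-way classification, and stage 3 consumes both earlier outputs to produce a residual correction to the displacement, mirroring the cascade described in the abstract.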




Updated: 2020-03-09