Deep understanding of shopper behaviours and interactions using RGB-D vision
Machine Vision and Applications ( IF 2.4 ) Pub Date : 2020-09-13 , DOI: 10.1007/s00138-020-01118-w
Marina Paolanti , Rocco Pietrini , Adriano Mancini , Emanuele Frontoni , Primo Zingaretti

In retail environments, understanding how shoppers move through a store's spaces and interact with products is highly valuable. Although the retail environment has several characteristics favourable to computer vision, such as consistent lighting, the large number and diversity of products sold, together with the potential ambiguity of shoppers' movements, mean that accurately measuring shopper behaviour remains challenging. In past years, machine-learning and feature-based tools for people counting, interaction analytics and re-identification were developed to learn shopper behaviour from occlusion-free RGB-D cameras in a top-view configuration. With the arrival of multimedia big data, however, these machine-learning approaches have evolved into deep learning approaches, which handle the complexities of human behaviour more powerfully and efficiently. This paper introduces a novel VRAI deep learning application that uses three convolutional neural networks to count the people passing or stopping in the camera area, perform top-view re-identification and measure shopper–shelf interactions, all from a single RGB-D video stream with near-real-time performance. The framework is evaluated on three new, publicly available datasets: TVHeads for people counting, HaDa for shopper–shelf interactions and TVPR2 for people re-identification. The experimental results show that the proposed methods significantly outperform competitive state-of-the-art methods (99.5% accuracy on people counting, 92.6% on interaction classification and 74.5% on re-identification), yielding distinct and significant insights for implicit, large-scale shopper behaviour analysis in marketing applications.
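The key property exploited by the abstract's top-view setup is that, with a ceiling-mounted RGB-D camera, heads are the points closest to the sensor, so people can be isolated in the depth channel without occlusion. As a hedged illustration of that idea only, the sketch below counts head blobs in a single depth frame with a classical depth threshold and connected-component labeling; it stands in for the paper's CNN, and the threshold and area values are illustrative assumptions, not values from the paper.

```python
import numpy as np
from collections import deque

def count_heads(depth, head_max_mm=1600, min_area=50):
    """Count distinct head blobs in a top-view depth frame (mm).

    In a ceiling-mounted configuration heads are the pixels closest to
    the camera, so a simple depth threshold isolates them without
    occlusion. This classical baseline stands in for the paper's CNN;
    `head_max_mm` and `min_area` are illustrative, not from the paper.
    """
    mask = (depth > 0) & (depth < head_max_mm)  # candidate head pixels
    visited = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    count = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] and not visited[sy, sx]:
                # BFS flood fill over the 4-connected component
                area, queue = 0, deque([(sy, sx)])
                visited[sy, sx] = True
                while queue:
                    y, x = queue.popleft()
                    area += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not visited[ny, nx]:
                            visited[ny, nx] = True
                            queue.append((ny, nx))
                if area >= min_area:  # discard small noise blobs
                    count += 1
    return count

# Synthetic 64x64 depth frame: floor at 3000 mm, two heads near 1400 mm
frame = np.full((64, 64), 3000, dtype=np.int32)
frame[10:20, 10:20] = 1400
frame[40:52, 30:42] = 1350
print(count_heads(frame))  # two blobs above the head threshold -> 2
```

In the paper's pipeline a CNN replaces the threshold-and-label step, which makes the counting robust to varying heights, hats and carried objects that a fixed depth cut cannot separate.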

Updated: 2020-09-13