Temporal Hierarchical Dictionary Guided Decoding for Online Gesture Segmentation and Recognition, IEEE Transactions on Image Processing



IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2020-10-14, DOI: 10.1109/tip.2020.3028962
Haoyu Chen, Xin Liu, Jingang Shi, Guoying Zhao

Online segmentation and recognition of skeleton-based gestures are challenging. Compared with the offline case, online inference can rely only on the few most recent frames and always completes before the whole movement has been performed. However, incompletely performed gestures are ambiguous, and their early recognition easily falls into a local optimum. In this work, we address the problem with a temporal hierarchical dictionary that guides the hidden Markov model (HMM) decoding procedure. The intuition is that gestures are ambiguous, with high uncertainty, in their early phases and become discriminative only after certain phases. This uncertainty can naturally be measured by entropy. Thus, we propose a measurement called the “relative entropy map” (REM) that encodes this temporal context to guide HMM decoding. Furthermore, we introduce a progressive learning strategy with which neural networks can learn robust recognition of HMM states in an iterative manner. Our method is extensively evaluated on three challenging databases and achieves state-of-the-art results. It shows the ability both to extract discriminative connotations and to reduce the large redundancy in the HMM transition process. The results verify that our framework can achieve online recognition of continuous gesture streams even when the gestures are only halfway performed.
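
The idea that early-phase ambiguity can be measured by entropy can be illustrated with a minimal sketch. This is not the paper's REM definition, which the abstract does not give; it only computes the Shannon entropy of a per-frame class posterior. The posterior values and the helper names `shannon_entropy` and `normalized_entropy` are hypothetical and chosen purely for illustration.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def normalized_entropy(probs):
    """Entropy divided by log(number of classes), so the value lies in [0, 1]."""
    return shannon_entropy(probs) / math.log(len(probs))


# Hypothetical per-frame posteriors over three gesture classes (made-up numbers).
# Early frames are close to uniform; later frames concentrate on one class.
posteriors = [
    [0.34, 0.33, 0.33],  # early phase: highly ambiguous
    [0.50, 0.30, 0.20],
    [0.80, 0.15, 0.05],
    [0.97, 0.02, 0.01],  # late phase: nearly certain
]

entropies = [normalized_entropy(p) for p in posteriors]
for t, h in enumerate(entropies):
    print(f"frame {t}: normalized entropy = {h:.3f}")
```

In this toy sequence the normalized entropy falls as the gesture progresses, which matches the intuition stated in the abstract: a decoder could, under these assumptions, trust frame-level predictions less while the entropy is still high.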


Updated: 2020-10-14
