JSE: Joint Semantic Encoder for zero-shot gesture learning, Pattern Analysis and Applications


Pattern Analysis and Applications (IF 3.7), Pub Date: 2021-06-11, DOI: 10.1007/s10044-021-00992-y
Naveen Madapana, Juan Wachs

Zero-shot learning (ZSL) is a transfer learning paradigm that aims to recognize unseen categories from only a high-level description of them. While deep learning has greatly pushed the limits of ZSL for object classification, ZSL for gesture recognition (ZSGL) remains largely unexplored. Previous attempts to address ZSGL focused on the creation of gesture attributes and on algorithmic improvements, and there is little or no research on feature selection for ZSGL. It is indisputable that deep learning has obviated the need for feature engineering in problems with large datasets. However, when data are scarce, it is critical to leverage domain information to create discriminative input features. The main goal of this work is to study the effect of three different feature extraction techniques (velocity, heuristical and latent features) on the performance of ZSGL. In addition, we propose a bilinear auto-encoder approach for ZSGL, referred to as Joint Semantic Encoder (JSE), that jointly minimizes the reconstruction, semantic and classification losses. We conducted extensive experiments to compare and contrast the feature extraction techniques and to evaluate the performance of JSE against existing ZSL methods. In the attribute-based classification scenario, irrespective of the feature type, results showed that JSE outperforms other approaches by 5% (p<0.01). When JSE is trained with heuristical features in the across-category condition, it significantly outperforms other methods by 5% (p<0.01).
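
The abstract says only that JSE jointly minimizes reconstruction, semantic and classification losses; it does not give the formulation. The following is a minimal sketch of what such a joint objective could look like, not the authors' implementation. It assumes a single linear encoder `W` whose transpose is reused as the decoder, squared-error reconstruction and semantic terms, a softmax cross-entropy classification term, and hypothetical weights `lam_sem` and `lam_cls`. All function and parameter names are illustrative.

```python
import numpy as np

def joint_loss(X, S, y, A, W, lam_sem=1.0, lam_cls=1.0):
    """Weighted sum of reconstruction, semantic and classification losses.

    X: (n, d) input features; S: (n, a) per-sample attribute targets;
    y: (n,) integer labels of seen classes; A: (c, a) class attribute matrix;
    W: (d, a) encoder weights, with W.T reused as the decoder (an assumption).
    """
    Z = X @ W                      # encode features into attribute space
    X_hat = Z @ W.T                # decode with tied weights
    rec = np.mean(np.sum((X - X_hat) ** 2, axis=1))   # reconstruction term
    sem = np.mean(np.sum((Z - S) ** 2, axis=1))       # semantic term
    logits = Z @ A.T               # compatibility of each sample with each class
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    cls = -np.mean(log_probs[np.arange(len(y)), y])   # cross-entropy term
    return rec + lam_sem * sem + lam_cls * cls

def predict_unseen(X, W, A_unseen):
    """Assign each sample to the unseen class with the nearest attribute vector."""
    Z = X @ W
    d2 = ((Z[:, None, :] - A_unseen[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)
```

At test time, a sketch like this would encode a gesture with `W` and call `predict_unseen` with the attribute vectors of classes that were never seen during training, which is the usual attribute-based ZSL inference pattern.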


Updated: 2021-06-13
