Multibranch Adversarial Regression for Domain Adaptative Hand Pose Estimation
IEEE Transactions on Circuits and Systems for Video Technology (IF 8.4), Pub Date: 2022-03-10, DOI: 10.1109/tcsvt.2022.3158676
Rui Jin 1, Jing Zhang 2, Jianyu Yang 1, Dacheng Tao 3

Although hand pose estimation has achieved great success in recent years, RGB-based estimation still faces challenges, the most significant of which is the lack of labeled training data. Synthetic datasets offer plenty of accurately annotated images, but their difference from real-world datasets hurts generalization. Transfer learning, which tries to transfer knowledge from a labeled source domain to an unlabeled target domain, is therefore a frequent solution. Existing methods such as mean-teacher, CycleGAN, and MCD train models with the help of easily accessible domains such as synthetic data. However, these methods are not guaranteed to work well in real-world settings because of the domain shift. In this paper, we design a new unsupervised domain adaptation method for hand pose estimation, named Multi-branch Adversarial Regressors (MarsDA), which could be better suited to feature transfer. Specifically, we first generate pseudo-labels for the unlabeled target-domain data. Then, we introduce a new adversarial training loss between multiple regression branches, designed for hand pose estimation, to narrow the domain gap. In this way, our model reduces the pseudo-label noise caused by the domain gap and improves pseudo-label accuracy. We evaluate our method on two publicly available real-world datasets, H3D and STB. Experimental results show that our method outperforms existing methods by a large margin.
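
The abstract names two ingredients, pseudo-labels for target images and an adversarial loss between several regression branches, without giving formulas. The sketch below is only an illustration of those two ideas in plain Python and is not the MarsDA loss: it assumes the pseudo-label is the joint-wise mean of the branch outputs and that branch disagreement is measured by a mean absolute difference, as in MCD-style training. The names `pseudo_label` and `branch_discrepancy` are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

# One branch output: a list of 2D keypoints, one (x, y) pair per hand joint.
Keypoints = List[Tuple[float, float]]


def pseudo_label(branches: List[Keypoints]) -> Keypoints:
    """Assumed pseudo-label rule: average the branch predictions joint by joint."""
    n = len(branches)
    num_joints = len(branches[0])
    return [
        (
            sum(b[j][0] for b in branches) / n,
            sum(b[j][1] for b in branches) / n,
        )
        for j in range(num_joints)
    ]


def branch_discrepancy(branches: List[Keypoints]) -> float:
    """Assumed discrepancy: mean absolute coordinate difference over all branch pairs."""
    pairs = list(combinations(branches, 2))
    total = 0.0
    for a, b in pairs:
        diff = sum(abs(pa[0] - pb[0]) + abs(pa[1] - pb[1]) for pa, pb in zip(a, b))
        total += diff / (2 * len(a))  # normalize by number of coordinates
    return total / len(pairs)


# Three hypothetical branch outputs for one unlabeled target image (two joints).
branches = [
    [(0.0, 0.0), (2.0, 2.0)],
    [(2.0, 0.0), (2.0, 4.0)],
    [(4.0, 0.0), (2.0, 0.0)],
]
label = pseudo_label(branches)
gap = branch_discrepancy(branches)
```

In an MCD-style setup, the regression branches would be updated to increase such a discrepancy on target images while the shared feature extractor is updated to decrease it; whether MarsDA follows exactly this scheme is not stated in the abstract.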

Updated: 2022-03-10