Audio-Visual Particle Flow SMC-PHD Filtering for Multi-Speaker Tracking
IEEE Transactions on Multimedia (IF 8.4), Pub Date: 2020-04-01, DOI: 10.1109/tmm.2019.2937185
Yang Liu, Volkan Kilic, Jian Guan, Wenwu Wang

Sequential Monte Carlo probability hypothesis density (SMC-PHD) filtering has recently become a popular method for audio-visual (AV) multi-speaker tracking. However, because of the weight degeneracy problem, the particle-based estimate can represent the posterior distribution poorly when only a few particles lie near the peak of the likelihood density function. To address this issue, we propose a new framework in which particle flow (PF) is used to migrate particles smoothly from the prior to the posterior probability density. We consider both zero and non-zero diffusion particle flows (ZPF/NPF) and develop two new algorithms, AV-ZPF-SMC-PHD and AV-NPF-SMC-PHD, in which the speaker states from the previous frames are also considered for particle relocation. The proposed algorithms are compared systematically with several baseline tracking methods on the AV16.3, AVDIAR and CLEAR datasets, and are shown to offer improved tracking accuracy and average effective sample size (ESS).
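The weight degeneracy and ESS mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional state, a Gaussian likelihood, and the commonly used estimate ESS = 1 / sum(w_i^2) over normalized weights (in a PHD filter the weights sum to the expected number of targets, so they are normalized here first). The function names, the measurement value 2.5, and the noise levels are hypothetical choices for illustration only.

```python
import numpy as np


def effective_sample_size(weights):
    """Common ESS estimate: 1 / sum(w_i^2), computed on normalized weights."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to one
    return 1.0 / np.sum(w ** 2)


def gaussian_likelihood_weights(particles, measurement, sigma):
    """Normalized weights from an assumed Gaussian likelihood (illustrative only)."""
    log_w = -0.5 * ((particles - measurement) / sigma) ** 2
    log_w -= log_w.max()  # subtract the max for numerical stability
    w = np.exp(log_w)
    return w / w.sum()


rng = np.random.default_rng(0)
particles = rng.normal(0.0, 1.0, size=500)  # particles drawn from an assumed prior

# Broad likelihood: many particles receive a non-negligible weight.
broad = gaussian_likelihood_weights(particles, measurement=2.5, sigma=2.0)
# Sharp likelihood peaked in the prior's tail: only a few particles matter.
sharp = gaussian_likelihood_weights(particles, measurement=2.5, sigma=0.05)

ess_broad = effective_sample_size(broad)
ess_sharp = effective_sample_size(sharp)
print(f"ESS with broad likelihood: {ess_broad:.1f} of {particles.size}")
print(f"ESS with sharp likelihood: {ess_sharp:.1f} of {particles.size}")
```

With the sharp likelihood, almost all of the weight falls on the few particles near the peak, so the ESS collapses toward 1. This is the situation in which the abstract proposes moving particles toward the posterior with particle flow rather than relying on reweighting alone.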

Updated: 2020-04-01
