Hierarchical Bayesian LSTM for Head Trajectory Prediction on Omnidirectional Images
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8). Pub Date: 2021-10-02, DOI: 10.1109/tpami.2021.3117019
Li Yang, Mai Xu, Yichen Guo, Xin Deng, Fangyuan Gao, Zhenyu Guan
When viewing omnidirectional images (ODIs), viewers can access different viewports via head movement (HM), which sequentially forms head trajectories in the spatial-temporal domain. Thus, head trajectories play a key role in modeling human attention on ODIs. In this paper, we establish a large-scale dataset collecting 21,600 head trajectories on 1,080 ODIs. By mining our dataset, we find two important factors influencing head trajectories, i.e., temporal dependency and subject-specific variance. Accordingly, we propose a novel approach, called HiBayes-LSTM, that integrates hierarchical Bayesian inference into a long short-term memory (LSTM) network for head trajectory prediction on ODIs. In HiBayes-LSTM, we develop a Future Intention Estimation (FIE) mechanism, which captures the temporal correlations among previous, current, and estimated future information for predicting viewport transitions. Additionally, a training scheme called Hierarchical Bayesian Inference (HBI) is developed for modeling inter-subject uncertainty in HiBayes-LSTM. For HBI, we introduce a joint Gaussian distribution in a hierarchy to approximate the posterior distribution over network weights. By sampling subject-specific weights from the approximated posterior distribution, our HiBayes-LSTM approach can yield diverse viewport transitions among different subjects and thus obtain multiple head trajectories. Extensive experiments validate that our HiBayes-LSTM approach significantly outperforms 9 state-of-the-art approaches for trajectory prediction on ODIs, and it is also successfully applied to saliency prediction on ODIs.
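The abstract's central idea, sampling subject-specific network weights from an approximate Gaussian posterior so that different samples produce different head trajectories, can be illustrated with a minimal sketch. The toy "network" below is a single linear transition map rather than an LSTM, and all names, shapes, and values are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "network": one linear layer mapping the current 2-D viewport
# position to the next. The approximate posterior over its weights is a
# diagonal Gaussian N(mu, sigma^2), loosely mirroring how HiBayes-LSTM
# samples subject-specific weights from an approximated posterior.
mu = np.array([[0.9, 0.1],
               [-0.1, 0.9]])          # posterior mean of the weights
sigma = np.full((2, 2), 0.05)         # posterior standard deviation

def sample_weights():
    """Draw one subject-specific weight matrix W ~ N(mu, sigma^2)."""
    return mu + sigma * rng.standard_normal(mu.shape)

def predict_trajectory(start, n_steps, W):
    """Roll out a head trajectory by repeatedly applying W."""
    traj = [np.asarray(start, dtype=float)]
    for _ in range(n_steps):
        traj.append(W @ traj[-1])
    return np.stack(traj)

# Each weight sample plays the role of one subject, so repeated
# sampling yields diverse trajectories from the same starting viewport.
start = np.array([1.0, 0.0])
trajs = [predict_trajectory(start, 5, sample_weights()) for _ in range(3)]
```

Because the weights themselves are random variables, every draw gives a slightly different transition map, which is what lets a single trained model produce multiple plausible trajectories instead of one deterministic path.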

Updated: 2021-10-02