Gravitational Models Explain Shifts on Human Visual Attention,arXiv - CS - Artificial Intelligence


arXiv - CS - Artificial Intelligence. Pub Date: 2020-09-15, DOI: arxiv-2009.06963
Dario Zanca, Marco Gori, Stefano Melacci, Alessandra Rufa

Visual attention refers to the human brain's ability to select relevant sensory information for preferential processing, improving performance in visual and cognitive tasks. It proceeds in two phases: one in which visual feature maps are acquired and processed in parallel, and another in which the information from these maps is merged to select a single location to attend to for further, more complex computation and reasoning. Describing this process computationally is challenging, especially when its temporal dynamics are taken into account. Numerous methods for estimating saliency have been proposed over the last three decades. They achieve almost perfect performance in estimating saliency at the pixel level, but the way they generate shifts in visual attention depends entirely on winner-take-all (WTA) circuitry. WTA is implemented by the biological hardware in order to select the location with maximum saliency, towards which overt attention is directed. In this paper we propose a gravitational model (GRAV) to describe attentional shifts. Every single feature acts as an attractor, and the shifts result from the joint effects of the attractors. In this framework, the assumption of a single, centralized saliency map is no longer necessary, though still plausible. Quantitative results on two large image datasets show that this model predicts shifts more accurately than winner-take-all.
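The contrast the abstract draws between winner-take-all selection and attraction by many features can be illustrated with a toy sketch. This is not the authors' formulation: the abstract gives no equations, so the softened inverse-square pull, the damping term, the Euler integration, and the names `wta_target`, `net_attraction`, and `simulate` are all assumptions made for illustration only.

```python
import numpy as np


def wta_target(saliency):
    """Winner-take-all: return (row, col) of the single most salient pixel."""
    row, col = np.unravel_index(np.argmax(saliency), saliency.shape)
    return int(row), int(col)


def net_attraction(focus, mass, softening=1.0):
    """Net pull on `focus` when every pixel attracts it in proportion to `mass`.

    Assumption: a softened inverse-square law, chosen for illustration;
    the abstract does not specify the form of the attraction.
    """
    rows, cols = np.indices(mass.shape)
    # Displacement vectors from the focus to every pixel, shape (H, W, 2).
    disp = np.stack([rows - focus[0], cols - focus[1]], axis=-1).astype(float)
    dist2 = (disp ** 2).sum(axis=-1) + softening ** 2
    force = (mass / dist2 ** 1.5)[..., None] * disp
    return force.sum(axis=(0, 1))


def simulate(mass, start, steps=200, dt=0.1, damping=0.5):
    """Integrate a damped point 'focus of attention' with semi-implicit Euler.

    `damping`, `dt`, and `steps` are hypothetical parameters, not values
    taken from the paper.
    """
    pos = np.array(start, dtype=float)
    vel = np.zeros(2)
    path = [pos.copy()]
    upper = np.array(mass.shape, dtype=float) - 1.0
    for _ in range(steps):
        acc = net_attraction(pos, mass) - damping * vel
        vel = vel + dt * acc
        pos = np.clip(pos + dt * vel, 0.0, upper)  # keep the focus inside the image
        path.append(pos.copy())
    return np.array(path)
```

In this sketch, two equally strong attractors placed symmetrically around the focus cancel each other out, whereas `wta_target` simply returns the first maximum it finds. That is the sense in which the shift is a joint effect of all attractors rather than a jump to one winning location.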


Updated: 2020-10-02
