当前位置: X-MOL 学术bioRxiv. Anim. Behav. Cognit. › 论文详情
Our official English website, www.x-mol.net, welcomes your feedback! (Note: you will need to create a separate account there.)
Deep Neural Network Models of Object Recognition Exhibit Human-like Limitations When Performing Visual Search Tasks
bioRxiv - Animal Behavior and Cognition Pub Date : 2021-01-12 , DOI: 10.1101/2020.10.26.354258
David A. Nicholson , Astrid A. Prinz

To find an object we are looking for, we must recognize it. Prevailing models of visual search neglect recognition, focusing instead on selective attention mechanisms. These models account for performance limitations that participants exhibit when searching highly simplified stimuli often used in laboratory tasks. However, it is unclear how to apply these models to complex natural images of real-world objects. Deep neural networks (DNN) can be applied to any image, and recently have emerged as state-of-the-art models of object recognition in the primate ventral visual pathway. Using these DNN models, we ask whether object recognition explains limitations on performance across visual search tasks. First, we show that DNNs exhibit a hallmark effect seen when participants search simplified stimuli. Further experiments show this effect results from optimizing for object recognition: DNNs trained from randomly-initialized weights do not exhibit the same performance limitations. Next, we test DNN models of object recognition with natural images, using a dataset where each image has a visual search difficulty score, derived from human reaction times. We find DNN accuracy is inversely correlated with visual search difficulty score. Our findings suggest that to a large extent visual search performance is explained by object recognition.

中文翻译:

执行视觉搜索任务时,对象识别的深度神经网络模型表现出类似人的局限性

要找到我们正在寻找的对象,我们必须识别它。视觉搜索的流行模型忽略了识别,而是专注于选择性注意机制。这些模型解决了参与者在搜索实验室任务中经常使用的高度简化的刺激时表现出的性能限制。但是,尚不清楚如何将这些模型应用于现实世界对象的复杂自然图像。深度神经网络(DNN)可以应用于任何图像,并且最近已成为灵长类动物腹侧视觉通路中对象识别的最新模型。使用这些DNN模型,我们询问对象识别是否解释了视觉搜索任务的性能限制。首先,我们表明,当参与者搜索简化刺激时,DNN表现出标志性效果。进一步的实验表明,这种效果来自优化对象识别:从随机初始化的权重训练的DNN并没有表现出相同的性能局限性。接下来,我们使用一个数据集测试具有自然图像的DNN模型,其中每个图像都有一个从人类反应时间得出的视觉搜索难度评分。我们发现DNN的准确性与视觉搜索难度分数成反比。我们的发现表明,视觉搜索性能在很大程度上是由对象识别来解释的。我们发现DNN的准确性与视觉搜索难度分数成反比。我们的发现表明,视觉搜索性能在很大程度上是由对象识别来解释的。我们发现DNN的准确性与视觉搜索难度分数成反比。我们的发现表明,视觉搜索性能在很大程度上是由对象识别来解释的。
更新日期:2021-01-13
down
wechat
bug