Elsevier

Applied Soft Computing

Volume 96, November 2020, 106653
Visual fixation prediction with incomplete attention map based on brain storm optimization

https://doi.org/10.1016/j.asoc.2020.106653

Highlights

  • The proposed model calculates the attention regions in a partially random manner.

  • The outputs of the proposed model are incomplete saliency maps.

  • The Brain Storm Optimization algorithm is applied to guide the search process.

  • Eight fixation prediction models and 24 salient region detection models are compared.

  • Results show that the proposed method is effective in predicting fixations rapidly.

Abstract

We cannot see everything around us. Instead, the visual attention mechanism selects a few fixations from the extensive visual information for further processing. Many computational attention models have been proposed to imitate this mechanism. However, almost all state-of-the-art computational models output a complete saliency map, which means that every region in a scene must be examined before the more salient parts can be identified. According to findings in neuroscience research, it is unnecessary to evaluate every part of a scene at the glance stage of visual perception. Many researchers believe that the attention maps in different parts of the brain are incomplete. This paper presents a visual fixation prediction model that calculates the attention regions in a partially random manner, so the output saliency map is incomplete. We first translate the fixation prediction problem into a 2-D search problem and then apply a recently proposed swarm intelligence algorithm, Brain Storm Optimization (BSO), to search for fixations in different scenes. The proposed method guides the search process so that it converges to relatively more salient positions over the iterations without visiting every part of an image. We evaluate the proposed method on the large-scale CAT2000 dataset and the Extended Complex Scene Saliency Dataset (ECSSD). A comparative study against eight other fixation prediction models and 24 salient region detection models indicates that the proposed method predicts fixations effectively and rapidly, which makes it a good candidate for computational visual attention modeling.

Introduction

In the human visual perception system, far more information reaches our retinas at any one time than our cognitive systems can process [1]. The visual system has evolved a remarkable attention mechanism that selects a fixation from extensive visual information for further processing [2]. This mechanism is often available at the early stage of the visual perception process, during spatial representation and information coding. As a result, the human visual system can deal with complex dynamic scenes and quickly select the most salient cues. Many computational attention models have been proposed to imitate this mechanism [3]. These computational models have also been shown to have potential in applications including object detection and recognition [4], video compression [5], and target tracking [6], as well as autonomous systems such as self-driving cars [7], [8] and adaptive robotic systems [9], [10]. The output of those models is a grayscale saliency map in which the intensity of each pixel represents the likelihood that it belongs to a salient object [11].

However, the output of almost all state-of-the-art computational models is a complete saliency map, which means that every region in a scene must be examined before the more salient parts can be identified. This is contrary to reality: we cannot see everything around us. That is why we need an attention mechanism, so that we do not process everything in a scene the way conventional image processing does. Instead, we process the information in a partially random manner. If there is a saliency map in our brain, it must be an incomplete one. Furthermore, some researchers believe that there is only one attention spotlight in our visual field, i.e., the attention mechanism selects a single region in a scene while neglecting the other regions. However, more recent work has shown that visual attention may select a small number of objects simultaneously while ignoring others, which is called multi-focal visual attention [12]. To overcome the shortcomings of the traversal algorithms in this field, we propose a multi-focal visual fixation prediction model based on an incomplete attention map. The model borrows its principle from a recently proposed swarm intelligence algorithm, the Brain Storm Optimization (BSO) algorithm, to search for and locate visual fixations rapidly in a random manner with an incomplete saliency map.

The proposed method is rooted in recent discoveries in neuroscience and inspired by the process of building visual attention model benchmarks. According to neuroscience research, it is impossible to evaluate every part of a scene to determine which part is more salient at the glance stage of visual perception, so the proposed incomplete attention map is more consistent with the biological findings. BSO is a swarm intelligence algorithm inspired by the human brainstorming process, which people often use to find optimal solutions to a problem at hand. The algorithm can guide the search process to converge to optimal solutions over iterations. It has also been applied successfully in many areas, such as electric power systems [13], optimal control problems [14], and wireless sensor networks [15]. Through convergent and divergent operations, individuals in BSO are grouped and diverged in the search space [16]. The search performance benefits from this inherent advantage, which makes BSO a promising method for solving multi-focal attention search problems.

Therefore, the main contribution of this paper is a new neuroscience-inspired fixation prediction model based on an incomplete multi-focal attention map and built on the meta-heuristic BSO algorithm. Unlike other optimization-based methods, the BSO algorithm is not used to optimize parameters but is applied directly to the image to search for salient areas. Starting from an image scene segmented into superpixels, we initialize several solution positions randomly in the superpixel space. Applying a modified BSO algorithm in this space makes a small number of attention regions pop up. Taking those regions as growing seeds then yields an incomplete multi-focal saliency map. Since the BSO method converges to better results with each iteration, not all parts of the image need to be explored. Test results on the benchmark datasets show the effectiveness of the proposed method, and a comparative study with other state-of-the-art fixation prediction methods is also given. The experiments indicate that the proposed method can predict multi-focal attention. Moreover, since we do not go through all parts of a visual scene, the computational cost is low compared with other methods, making the model a good candidate for rapid fixation prediction.
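The idea of searching without visiting every region can be illustrated with a minimal sketch. It is not the authors' modified BSO: it assumes a simplified elite/normal variant of BSO, a regular grid of cells standing in for superpixels, and a toy saliency score made of Gaussian bumps. The function names (`make_saliency`, `bso_search`), the parameter names (`perc_e`, `p_e`, `p_one`), and all default values are illustrative assumptions, and the seed-growing step is omitted. The dictionary `visited` plays the role of the incomplete map, because a cell receives a score only when the search lands on it.

```python
import math
import random


def make_saliency(peaks):
    """Toy saliency score (assumption): a sum of Gaussian bumps (row, col, sigma)."""
    def score(cell):
        r, c = cell
        return sum(math.exp(-((r - pr) ** 2 + (c - pc) ** 2) / (2.0 * s * s))
                   for pr, pc, s in peaks)
    return score


def bso_search(score, rows, cols, n=12, iters=15,
               perc_e=0.2, p_e=0.3, p_one=0.5, seed=0):
    """Simplified BSO-style search on a rows x cols grid with lazy scoring."""
    rng = random.Random(seed)
    visited = {}  # incomplete map: cell -> score, filled only for evaluated cells

    def evaluate(cell):
        if cell not in visited:
            visited[cell] = score(cell)
        return visited[cell]

    def clip(v, size):
        return max(0, min(size - 1, int(round(v))))

    # Random initial solution positions.
    pop = [(rng.randrange(rows), rng.randrange(cols)) for _ in range(n)]
    for t in range(iters):
        # Grouping: split the population into elites and normals by score.
        ranked = sorted(pop, key=evaluate, reverse=True)
        n_elite = max(1, int(perc_e * n))
        elites, normals = ranked[:n_elite], ranked[n_elite:]
        step = 0.25 * max(rows, cols) * (1 - t / iters)  # shrinking step size
        for i in range(n):
            # New individual generation from one or two existing individuals.
            group = elites if (rng.random() < p_e or not normals) else normals
            if rng.random() < p_one or len(group) < 2:
                base = rng.choice(group)
            else:
                a, b = rng.sample(group, 2)
                w = rng.random()
                base = (w * a[0] + (1 - w) * b[0], w * a[1] + (1 - w) * b[1])
            cand = (clip(base[0] + rng.gauss(0, step), rows),
                    clip(base[1] + rng.gauss(0, step), cols))
            # Selection: keep the candidate only if it scores higher.
            if evaluate(cand) > evaluate(pop[i]):
                pop[i] = cand
    return pop, visited


rows, cols = 40, 60
score = make_saliency([(10, 15, 4.0), (30, 45, 5.0)])
pop, visited = bso_search(score, rows, cols)
print(len(visited), "of", rows * cols, "cells evaluated")
```

With these settings at most n + n × iters = 192 distinct cells can be scored, out of 2400 on the grid, so the resulting map is incomplete by construction. Whether the final positions settle on one bump or on both depends on the random seed and on the parameter values.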

The remainder of this article is organized as follows. Section 2 reviews related work from both the neuroscience and computer vision points of view. Section 3 gives the visual attention problem statement. The proposed visual fixation prediction method based on BSO is introduced in Section 4. Section 5 presents the comparative results with discussion. The conclusions and future work are summarized in Section 6.

Section snippets

Related works

From the neuroscience point of view, numerous works have addressed the selective visual attention mechanisms. Those works often use technologies such as eye-tracking and functional magnetic resonance imaging (fMRI) to evaluate the visually driven neuronal responses in the brain [17]. It has been shown that the deployment of visual attention is triggered either by physical stimuli from the surroundings or by internal and behavioral goals. Those two kinds of attention thus

Problem statement

From the analysis above, the purpose of visual fixation or attention determination is to find the most salient region(s) in a particular scene. In this paper, we propose to locate those regions quickly by translating the problem into a salient area search task. Searching for the best available solution(s) to a given problem is formulated as an optimization problem. In swarm intelligence algorithms, each individual in the swarm represents a solution in the search space. According to collective behaviors

Brain Storm Optimization in objective space

The Brain Storm Optimization algorithm is a relatively new and promising algorithm in the swarm intelligence family. It was inspired by one of the collective problem-solving skills of human beings, i.e., the brainstorming process. The algorithm can obtain a good enough optimum for a problem defined in a search space. There are three operations in this algorithm: solution clustering, new individual generation, and selection [50]. The original BSO clusters the solutions into several categories, and

Datasets

To evaluate the proposed fixation prediction method in natural scenes, we tested it on two datasets: CAT2000 [56] from the MIT saliency benchmark [57], [58] and the Extended Complex Scene Saliency Dataset (ECSSD) provided by Yan et al. [59].

The CAT2000 dataset contains 2000 images with eye-tracking data from 24 observers. There are 20 categories including (1) Action, (2) Affective, (3) Art, (4) Black & White, (5) Cartoon, (6) Fractal, (7) Indoor, (8) Inverted,

Parameter sensitivity

To better understand the impact of the BSO parameters on the proposed method, we performed extensive sensitivity tests on the key BSO parameters, with perc_e taking 10% and 20%, p_e taking 0.2, 0.3, and 0.5, and p_one taking 0.3, 0.5, and 0.8, respectively. The combinations of the above three parameters are shown in Table 3. For each of the above parameters, we conducted 10 independent tests on the ECSSD dataset. With reference to the same evaluation criteria mentioned above, the

Conclusion

Based on a swarm intelligence optimization algorithm (BSO), this paper introduced a visual fixation prediction model with an incomplete attention map, inspired by findings from neuroscience research on the mechanism of the human visual system. We cannot see everything around us, and at the early stage of visual perception one cannot evaluate every part of a scene to determine which area is more salient, so the proposed approach operates in a partially random way. The BSO algorithm has

CRediT authorship contribution statement

Jian Yang: Conceptualization, Methodology, Software, Formal analysis, Writing - original draft, Writing - review & editing. Yang Shen: Conceptualization, Methodology, Software, Visualization, Writing - review & editing. Yuhui Shi: Conceptualization, Methodology, Writing - review & editing, Supervision, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is partially supported by National Key R&D Program of China under the Grant No. 2017YFC0804003, National Science Foundation of China under grant number 61761136008, Shenzhen Peacock Plan, China under Grant No. KQTD2016112514355531, Program for Guangdong Introducing Innovative and Entrepreneurial Teams China under grant number 2017ZT07X386, and the Science and Technology Innovation Committee Foundation of Shenzhen China under the Grant No. ZDSYS201703031748284, and Guangdong Provincial

References (79)

  • Borji A. et al., State-of-the-art in visual attention modeling, IEEE Trans. Pattern Anal. Mach. Intell. (2012)
  • Itti L., Automatic foveation for video compression using a neurobiological model of visual attention, IEEE Trans. Image Process. (2004)
  • Frintrop S. et al., Most salient region tracking
  • Chen S. et al., Brain-inspired cognitive model with attention for self-driving cars, IEEE Trans. Cogn. Dev. Syst. (2017)
  • Dang T. et al., Visual saliency-aware receding horizon autonomous exploration with application to aerial robotics
  • Anantrasirichai N. et al., Fixation prediction and visual priority maps for biped locomotion, IEEE Trans. Cybern. (2018)
  • Borji A. et al., Salient object detection: A benchmark, IEEE Trans. Image Process. (2015)
  • Jordehi A.R., Brainstorm optimisation algorithm (BSOA): An efficient algorithm for finding optimal location and setting of FACTS devices in electric power systems, Int. J. Electr. Power Energy Syst. (2015)
  • Chen J. et al., Enhanced brain storm optimization algorithm for wireless sensor networks deployment
  • Cheng S. et al., Brain storm optimization algorithm: A review, Artif. Intell. Rev. (2016)
  • O’Connell T.P. et al., Predicting eye movement patterns from fMRI responses to natural scenes, Nature Commun. (2018)
  • Wolfe J.M. et al., Guided search: an alternative to the feature integration model for visual search, J. Exp. Psychol. [Hum. Percept.] (1989)
  • Wang F. et al., Modulation of neuronal responses by exogenous attention in macaque primary visual cortex, J. Neurosci. (2015)
  • Somers D.C. et al., Attention maps in the brain, Wiley Interdiscip. Rev. Cogn. Sci. (2013)
  • Lyyra P. et al., Look at them and they will notice you: Distractor-independent attentional capture by direct gaze in change blindness, Vis. Cogn. (2018)
  • Itti L. et al., A model of saliency-based visual attention for rapid scene analysis, IEEE Trans. Pattern Anal. Mach. Intell. (1998)
  • Frintrop S., VOCUS: A Visual Attention System for Object Detection and Goal-Directed Search, Vol. 3899 (2006)
  • Yang J. et al., Multiresolution saliency map based object segmentation, J. Electron. Imaging (2015)
  • Gao D. et al., Discriminant saliency for visual recognition from cluttered scenes
  • Torralba A., Modeling global scene factors in attention, J. Opt. Soc. Amer. A (2003)
  • Harel J. et al., Graph-based visual saliency
  • C. Yang, L. Zhang, H. Lu, X. Ruan, M.-H. Yang, Saliency detection via graph-based manifold ranking, in: Proceedings of...
  • Rosenholtz R. et al., The effect of background color on asymmetries in color search, J. Vis. (2004)
  • Bruce N. et al., Saliency based on information maximization
  • Hou X. et al., Saliency detection: A spectral residual approach
  • Achanta R. et al., Frequency-tuned salient region detection
  • Ramström O. et al., Visual attention using game theory
  • Cheng M.-M. et al., Global contrast based salient region detection, IEEE Trans. Pattern Anal. Mach. Intell. (2015)
  • Kienzle W. et al., Center-surround patterns emerge as optimal predictors for human saccade targets, J. Vis. (2009)