Identify autism spectrum disorder via dynamic filter and deep spatiotemporal feature extraction

https://doi.org/10.1016/j.image.2021.116195

Highlights

  • An effective model is proposed to identify children with autism spectrum disorder.

  • Dynamic filters are introduced to customize feature maps for each scanpath.

  • Field of view (FoV) maps are proposed to extract fixation-specific features.

  • A module based on scanpath and saliency prediction is proposed to generate spatiotemporal features.

Abstract

Early intervention and treatment are crucial for individuals with autism spectrum disorder (ASD). However, it is challenging to identify individuals with ASD at an early age, i.e. under 3 years old, due to the lack of an effective and objective identification method. Mainstream clinical diagnosis relies on long-term observation of children's behaviors, which is time-consuming and expensive, so accurately and quickly distinguishing children with ASD in early childhood has become a critical issue. In this paper, we propose an eye-movement-based model to identify children with ASD. Specifically, children are asked to freely observe a set of images while their eye movements are recorded for analysis. Both the observed image and the eye movements are input into our model, where they are processed by an embedding layer, dynamic filters and an LSTM block, respectively. Eventually, spatiotemporal features are extracted to determine whether the eye movements belong to a child with ASD or a typically developing child. Experiments on the Saliency4ASD dataset demonstrate that the proposed model achieves state-of-the-art performance in identifying children with ASD.

Introduction

ASD is a heritable neurodevelopmental disorder that persists across the lifespan [1]. Since people with ASD show developmental delays and often cannot take care of themselves, the disorder has a remarkably negative impact on families. The currently accepted and effective treatment is intervention at an early age [2], so accurately and quickly diagnosing children with ASD in early childhood has become a critical issue. Diagnosis, however, is quite difficult because of the disorder's complicated causes. At present, diagnosis mainly depends on subjective and time-consuming clinical judgment, e.g. observation [3] and interviews [4], which is limited by scarce medical services and tends to be inconsistent. There is thus an urgent need for an effective and objective method to identify individuals with ASD.

Researchers have applied eye trackers to collect eye movement data from subjects viewing a set of images, and found that the gazed points (namely fixations) differ significantly between neurodevelopmental disorder groups and typically developing groups, which indicates that individuals with neurodevelopmental disorders have an atypical visual attention pattern [5], [6]. Individuals with ASD prefer idiosyncratic objects (e.g. metalwork and keyboards) and hand-related utensils (e.g. scissors and bottles) [7], [8], [9]. Conversely, they tend to avoid eye contact with humans and pay reduced attention to faces [10], [11]. Consequently, eye movement data have been used as a biomarker to identify individuals with ASD [12], [13].

In recent years, the rapid development of deep learning has remarkably shaped many research directions and has also improved eye-movement-related research. Since deep learning demands large amounts of data, several eye movement datasets have been built and released [14], [15], [16]. There are also eye movement datasets that focus on specific populations, such as individuals with ASD [17], [18] and people of different ages [19]. The success of deep learning is also pushing ASD identification forward. For example, with deep-learning-based face detection models, eye movement data on face areas are analyzed and exploited to identify individuals with ASD [20]. Similarly, automatically predicted saliency values at fixation locations are exploited to classify a fixation sequence as belonging to either an individual with ASD or a typically developing (TD) individual [21], [22]; a toy illustration of this idea follows this paragraph. Moreover, deep visual features extracted by well-trained deep neural networks can be combined with eye movement data to identify people with ASD [23], [24]. However, due to inadequate mining of eye movement data characteristics and insufficient combination of eye movement data with deep features, the accuracy of existing identification methods leaves room for improvement.
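To make the saliency-at-fixation idea concrete, the toy sketch below samples a predicted saliency map at each fixation to form a per-fixation feature sequence, in the spirit of [21], [22]. The sampling scheme, function name and variables are illustrative assumptions, not the exact methods of those papers.

```python
# Toy illustration of the saliency-at-fixation idea: sample a predicted
# saliency map at each fixation to obtain a 1-D feature sequence that a
# classifier could consume. Details here are assumptions for illustration.
import numpy as np

def saliency_features(saliency_map, fixations):
    """saliency_map: (H, W) array in [0, 1]; fixations: list of (row, col)."""
    return np.array([saliency_map[r, c] for r, c in fixations])

sal = np.random.rand(480, 640)                         # stand-in for a model's prediction
seq = saliency_features(sal, [(100, 320), (240, 50)])  # per-fixation saliency values
```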

The proposed model takes a scanpath, which consists of a sequence of fixations, and the viewed image as inputs, and then predicts whether the viewer is an individual with ASD or a TD. Specifically, the scanpath is used repeatedly to fully capture the visual behavior pattern of an observer, and is combined with the visual feature maps extracted by a deep neural network (DNN) to better identify people with ASD. The contributions of this paper are two-fold:

  • We propose to convolve dynamic filters with the deep visual feature maps, so as to transform the universal feature maps, which are extracted from the observed image by a pre-trained visual feature encoder, into sample-specific feature maps. The dynamic filter generator learns to turn eye movement data into dynamic convolution kernels that adjust the responses in the feature maps according to the eye movements (a minimal sketch of this idea follows the list).

  • We propose a module that effectively extracts spatiotemporal features of a scanpath via field of view (FoV) maps. The spatial features are extracted from the field of view of each fixation and the deep feature maps, and the temporal features are computed from the spatial features by an LSTM (Long Short-Term Memory) block. The spatiotemporal features, which vary as the observer's attention shifts, are discriminative for ASD identification.
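As a concrete illustration of the first contribution, the minimal PyTorch sketch below maps an eye-movement embedding to per-sample convolution kernels and applies them to the universal feature maps with a grouped convolution. All module names and dimensions are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the dynamic-filter idea: an eye-movement embedding is
# mapped to per-sample depthwise convolution kernels, which are convolved
# with the universal visual feature maps to yield sample-specific maps.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicFilterGenerator(nn.Module):
    def __init__(self, embed_dim=128, channels=256, ksize=3):
        super().__init__()
        self.channels, self.ksize = channels, ksize
        # predict one k x k kernel per feature channel (depthwise filtering)
        self.fc = nn.Linear(embed_dim, channels * ksize * ksize)

    def forward(self, eye_embedding, feature_maps):
        b, c, h, w = feature_maps.shape
        kernels = self.fc(eye_embedding).view(b * c, 1, self.ksize, self.ksize)
        # grouped conv applies each sample's own kernels to its own channels
        x = feature_maps.reshape(1, b * c, h, w)
        out = F.conv2d(x, kernels, padding=self.ksize // 2, groups=b * c)
        return out.view(b, c, h, w)  # sample-specific feature maps

gen = DynamicFilterGenerator()
feat = torch.randn(2, 256, 28, 28)  # universal feature maps from the encoder
emb = torch.randn(2, 128)           # embedded eye-movement indicators
specific = gen(emb, feat)           # shape: (2, 256, 28, 28)
```

Folding the batch into the channel dimension and setting groups=b*c is a common trick for realizing per-sample dynamic convolution in a single batched call.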

The rest of this paper is organized as follows. Related work on ASD identification and on applications of dynamic filters is reviewed in Section 2. The proposed model is elaborated in Section 3. The experimental setup and results analysis are presented in Section 4. Section 5 summarizes the conclusions.

Section snippets

Previous work

Saliency4ASD [25], organized at IEEE ICME 2019, is a grand challenge aimed at promoting research on the visual attention of children with ASD. The challenge had two tracks: (1) predicting visual attention maps, a.k.a. saliency maps, of individuals with ASD, and (2) distinguishing children with ASD from typically developing children.

In the first track, we proposed a fully convolutional network that exploits multi-level feature maps to effectively predict the visual attention of children with ASD [26]. In [26] …

Proposed model

As shown in Fig. 1, given an image I and a scanpath P as inputs, the proposed model predicts whether the owner of the scanpath is an individual with ASD or a TD. First, a visual feature encoder encodes the image to obtain its visual feature maps. Second, the scanpath is reused in three modules, i.e. the eye movement embedding, the dynamic filter generator and the FoV maps generator, to fully capture the visual behavior pattern of an observer. Specifically, in the first module, the intention of using …
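The sketch below illustrates how the FoV-based spatiotemporal branch could work, assuming a Gaussian field-of-view map centered at each fixation, masked average pooling for the spatial feature, and an LSTM over the fixation sequence. These choices, and all names and dimensions, are assumptions for illustration rather than the paper's exact formulation.

```python
# Hedged sketch of FoV-based spatiotemporal feature extraction: each fixation
# yields a Gaussian field-of-view map that weights the feature maps; the
# weighted maps are pooled into one spatial feature vector per fixation, and
# an LSTM aggregates the sequence into a temporal feature for classification.
import torch
import torch.nn as nn

def fov_map(h, w, fx, fy, sigma=0.1):
    """Gaussian field-of-view map centered at a fixation (fx, fy in [0, 1])."""
    ys = torch.linspace(0, 1, h).view(h, 1)
    xs = torch.linspace(0, 1, w).view(1, w)
    return torch.exp(-((xs - fx) ** 2 + (ys - fy) ** 2) / (2 * sigma ** 2))

class SpatiotemporalBranch(nn.Module):
    def __init__(self, channels=256, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(channels, hidden, batch_first=True)
        self.cls = nn.Linear(hidden, 2)  # ASD vs. TD logits

    def forward(self, feature_maps, scanpath):
        # feature_maps: (C, H, W); scanpath: list of (fx, fy) fixations
        c, h, w = feature_maps.shape
        feats = []
        for fx, fy in scanpath:
            m = fov_map(h, w, fx, fy)                         # (H, W)
            weighted = feature_maps * m                       # fixation-specific response
            feats.append(weighted.sum(dim=(1, 2)) / m.sum())  # (C,) pooled vector
        seq = torch.stack(feats).unsqueeze(0)                 # (1, T, C)
        _, (h_n, _) = self.lstm(seq)
        return self.cls(h_n[-1])                              # (1, 2)

branch = SpatiotemporalBranch()
logits = branch(torch.randn(256, 28, 28), [(0.3, 0.4), (0.6, 0.5), (0.7, 0.2)])
```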

Experiments

In this section, we introduce the dataset, metrics and implementation details in Section 4.1. Then we conduct an ablation study to demonstrate the indispensability of each module in Section 4.2. Finally, we compare our model with state-of-the-art models in Section 4.3.

Conclusions

We propose an image-level model that identifies a scanpath as belonging to either an individual with ASD or a typically developing child. First, the viewed image is encoded by a visual feature encoder. Then, the scanpath is utilized in three modules. The first module is the eye movement embedding layer, which computes eye movement indicators and embeds them into a feature vector. The second module generates dynamic filters based on the scanpath. The dynamic filters transform the universal visual …

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grant No. 61771301.

References (49)

  • Thapar A. et al., Neurodevelopmental disorders, Lancet Psychiatr. (2017)
  • Wang S. et al., Atypical visual saliency in autism spectrum disorder quantified through model-based eye tracking, Neuron (2015)
  • Zalla T. et al., Reduced saccadic inhibition of return to moving eyes in autism spectrum disorders, Vis. Res. (2016)
  • Bradshaw J. et al., Feasibility and effectiveness of very early intervention for infants at-risk for autism spectrum disorder: A systematic review, J. Autism Dev. Disord. (2015)
  • Lord C. et al., Autism diagnostic observation schedule: A standardized observation of communicative and social behavior, J. Autism Dev. Disord. (1989)
  • Lord C. et al., Autism diagnostic interview-revised: A revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders, J. Autism Dev. Disord. (1994)
  • Sasson N.J. et al., Visual attention to competing social and object images by preschool children with autism spectrum disorder, J. Autism Dev. Disord. (2014)
  • Sweeney J.A. et al., Eye movements in neurodevelopmental disorders, Curr. Opin. Neurol. (2004)
  • Sasson N.J. et al., Affective responses by adults with autism are reduced to social images but elevated to images related to circumscribed interests, PLoS One (2012)
  • Duan H. et al., Learning to predict where the children with ASD look
  • Birmingham E. et al., Comparing social attention in autism and amygdala lesions: Effects of stimulus and task condition, Soc. Neurosci. (2011)
  • Duan H. et al., Visual attention analysis and prediction on human faces for children with autism spectrum disorder, ACM Trans. Multimed. Comput. Commun. Appl. (2019)
  • Murias M. et al., Validation of eye-tracking measures of social attention as a potential biomarker for autism clinical trials, Autism Res. (2018)
  • Freedman E.G. et al., Eye movements, sensorimotor adaptation and cerebellar-dependent learning in autism: Toward potential biomarkers and subphenotypes, Eur. J. Neurosci. (2018)
  • Xu J. et al., Predicting human gaze beyond pixels, J. Vis. (2014)
  • Borji A. et al., CAT2000: A large scale fixation dataset for boosting saliency research (2015)
  • Che Z. et al., How is gaze influenced by image transformations? Dataset and model, IEEE Trans. Image Process. (2020)
  • Duan H., Zhai G., Min X., Che Z., Fang Y., Yang X., Gutiérrez J., Le Callet P., A dataset of eye movements for the...
  • Carette R. et al., Visualization of eye-tracking patterns in autism spectrum disorder: Method and dataset
  • Bucher A. et al., Age differences in emotion perception in a multiple target setting: An eye-tracking study, Emotion (2019)
  • Liu W. et al., Identifying children with autism spectrum disorder based on their face processing abnormality: A machine learning framework, Autism Res. (2016)
  • Startsev M. et al., Classifying autism spectrum disorder based on scanpaths and saliency
  • Tao Y. et al., SP-ASDNet: CNN-LSTM based ASD classification model using observer scanpaths
  • Jiang M. et al., Learning visual attention to identify people with autism spectrum disorder, IEEE Int. Conf. Comput. Vis. (2017)