Towards a speech therapy support system based on phonological processes early detection

https://doi.org/10.1016/j.csl.2020.101130

Highlights

  • We present an architecture for supporting speech therapy, showing its component modules and the methods applied.

  • We address the prediction of Phonological Processes as a means of early identification of speech disorders and clinical support.

  • We based our architecture on two theories (Case-Based Reasoning and Situation-Awareness) that have not been investigated in the related literature for identification of Phonological Processes.

  • We evaluated our situation-aware and case-based architecture through a large Phonological Knowledge Base and machine learning methods.

  • We demonstrated how the identification of errors in pronunciation allows clinical support through recommendations on therapeutic planning.

Abstract

Phonological disorders are characterized by substitutions, insertions and/or deletions of sounds during the process of language acquisition, which are known as Phonological Processes (PPs). In the speech therapy domain, early identification of PPs allows the diagnosis and treatment of various pathologies and may improve clinical tasks; however, few proposals focus on the identification of PPs for supporting Speech-Language Pathologists (SLPs). Recent research has applied Case-Based Reasoning (CBR) in the medical domain to identify specific cases related to patients. Situation-Awareness (SA) is a technique that allows computing systems to adapt themselves and respond to users or other systems according to environment information. However, there is no indication in the related literature of CBR and SA being used for detecting PPs that may occur in pronunciation. In this paper, we combine SA and CBR with machine learning algorithms to propose a system that predicts PPs, supporting specialists in their clinical decisions. To evaluate the system, we implemented it as a software architecture prototype and evaluated it using a knowledge base containing nearly one hundred thousand audio files, collected from more than 1,000 pronunciation assessments. The evaluation showed an accuracy over 93% in the prediction of PPs, resulting in an efficient tool for clinical decision support and therapeutic planning. We also present a direct qualitative comparison between our approach and related work.

Introduction

Speech acquisition is a complex process in which the production of phonemes is characterized by continuities and discontinuities in the child's path toward mature production of the segments and structures of the ambient language (Rvachew and Bernhardt, 2010). Some error patterns in speech, also known as phonological processes (PPs), are typical in language acquisition and occur when the child tries to adapt his/her speech to adult speech (Ceron et al., 2017). The presence of these error patterns may indicate disordered or deviant development (Dodd, 2013), in which children often experience difficulties with literacy, including decoding, reading comprehension, and the production of written text (Joye et al., 2019). In this context, the monitoring of PPs in pronunciation is vital for the identification of delayed or disordered phonological development in children, which compromises biological, psychological, and social factors of the individual (Martín-Ruiz et al., 2013).

Knowledge of phonological development has great significance in the clinical population to determine whether a child has a phonological impairment and needs intervention (Abou-Elsaad et al., 2019). However, several barriers are found in speech therapy, especially in developing countries: the lack of resources, rehabilitation centers and qualified personnel to provide intervention services, the work overload of professionals, and the poor development or adaptation of assistive technologies according to patients' needs (Robles-Bykbaev et al., 2016). To face these difficulties, many studies have proposed computer-based speech therapy systems or virtual speech therapists (VSTs) for people with speech disorders (Chen et al., 2016). The related literature also presents studies focused on the development of speech recognition systems (Abad et al., 2013; Grossinho et al., 2016; Caballero-Morales and Trujillo-Romero, 2014; Bolaños et al., 2011) for supporting SLPs in clinical environments.

Although many studies focus on risk prevention and patient monitoring, few proposals in the speech therapy domain use knowledge modeling to improve tasks such as diagnosis, therapy planning, and therapeutic intervention (Chuchuca-Méndez et al., 2016). Moreover, the proposals are generally not concerned with identifying and inferring error patterns in speech. The identification of PPs in pronunciation may aid Speech-Language Pathologists (SLPs) in extending traditional speech therapy, identifying relevant and recurrent patterns in speech, and predicting behaviors according to these patterns. Consequently, it is possible to provide support to these professionals through optimization of therapeutic tasks, planning of therapy, and recommendation of necessary actions.

On these grounds, in this paper we present an architectural model that applies Situation-Awareness (SA) and Case-Based Reasoning (CBR) for automatically identifying PPs in children’s speech. SA has been recognized as an important resource in many different domains, and involves collecting contextual information about the environment, making decisions based on this collection, acting according to decisions, and gathering feedback from the environment for making better decisions in the future (Kokar and Endsley, 2012). CBR can also be a favorable choice in speech and health contexts, since this methodology has good learning capabilities, and its ability to solve problems improves as new cases are stored in history log files or in databases (Husain and Pheng, 2010).

According to a recently published systematic review (Franciscatto et al., 2018a), there is not yet a proposal in the literature that applies CBR and SA for automatically identifying Phonological Processes (PPs) and suggesting them to SLPs. Applied together, these theories may represent powerful tools for indicating PPs accurately, since the knowledge base is constantly updated and acquires capabilities over time as the diagnosed cases are confirmed.

Thus, the architectural model presented in this paper applies these theories in two main modules (Capture Module and Service Module) for automatically identifying PPs in children’s speech. The Capture Module is responsible for SA’s perception phase, through speech data collection (audio recording), while the Service Module is responsible for SA’s comprehension phase, through classification of pronunciation as correct or incorrect. In this module, the four activities of the CBR cycle (retrieval, reuse, review, and retain) are applied for prediction and recommendation of Phonological Processes, completing the SA’s projection level in the approach.
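The four CBR activities named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Case` fields, the Jaccard similarity, the majority vote, and the placeholder words and process labels are all assumptions made for illustration, since this section does not specify how cases are represented or compared.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Case:
    """One stored assessment (hypothetical representation)."""
    mispronounced: frozenset  # assumed feature: target words judged incorrect
    process: str              # assumed label: the confirmed phonological process


def jaccard(a, b):
    """Set similarity in [0, 1]; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


@dataclass
class CaseBase:
    cases: list = field(default_factory=list)

    def retrieve(self, query, k=3):
        """Retrieval: the k stored cases most similar to the new assessment."""
        ranked = sorted(self.cases,
                        key=lambda c: jaccard(c.mispronounced, query),
                        reverse=True)
        return ranked[:k]

    def reuse(self, neighbours):
        """Reuse: suggest the most frequent process among the retrieved cases."""
        if not neighbours:
            return None
        return Counter(c.process for c in neighbours).most_common(1)[0][0]

    def review(self, suggestion, therapist_label=None):
        """Review: the specialist confirms the suggestion or overrides it."""
        return therapist_label if therapist_label is not None else suggestion

    def retain(self, query, confirmed):
        """Retain: store the confirmed case for future retrievals."""
        if confirmed is not None:
            self.cases.append(Case(frozenset(query), confirmed))


# Toy data: placeholder words and labels, not taken from the paper.
base = CaseBase([
    Case(frozenset({"word_1", "word_2"}), "process_A"),
    Case(frozenset({"word_1", "word_4"}), "process_A"),
    Case(frozenset({"word_3", "word_5"}), "process_B"),
])
query = frozenset({"word_1", "word_2", "word_3"})
suggestion = base.reuse(base.retrieve(query, k=3))
confirmed = base.review(suggestion)  # no override: suggestion is accepted
base.retain(query, confirmed)
print(suggestion, len(base.cases))   # process_A 4
```

Retaining only reviewed cases mirrors the idea that the knowledge base improves as diagnosed cases are confirmed; a real system would replace the toy similarity with one suited to its speech features.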

We evaluated our proposal with a Phonological Knowledge Base containing speech samples collected from 1,114 evaluations performed with 84 Portuguese words in the last three years. These target words were chosen to facilitate the identification of PPs, where each word has a score associated with the possible PPs for the acquisition of the Portuguese language. The results showed an average accuracy of 92.5% for classifying pronunciations as correct or incorrect, in concordance with the therapist's evaluation. The pronunciation results were used for predicting the PPs; in this step, our approach achieved an accuracy over 93%, showing that it is possible to predict PPs for clinical decision support.
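Accuracy measured as agreement with the therapist can be read as the fraction of pronunciations for which the system's label matches the therapist's label. The sketch below assumes that reading; the function name and the eight toy labels are hypothetical and do not reproduce the reported figures.

```python
def agreement_accuracy(predicted, reference):
    """Fraction of items where the system label equals the therapist label."""
    if len(predicted) != len(reference):
        raise ValueError("both label lists must have the same length")
    if not reference:
        raise ValueError("at least one labelled pronunciation is required")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Toy labels (made up for illustration): the lists differ at one position.
system = ["correct", "incorrect", "correct", "correct",
          "incorrect", "incorrect", "correct", "incorrect"]
therapist = ["correct", "incorrect", "correct", "incorrect",
             "incorrect", "incorrect", "correct", "incorrect"]

score = agreement_accuracy(system, therapist)
print(f"{score:.3f}")  # 0.875 (7 of 8 labels agree)
```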

The present paper is structured as follows. In Section 2, the main concepts related to this research are presented, as well as the motivation of this work. In Section 3 we report our situation-aware and case-based approach for speech therapy support, while Section 4 shows the evaluation, results and discussion. Lastly, in Section 5 we present the conclusions and final remarks.


Background and motivation

This section presents the main concepts and related work covering the knowledge areas involved in this study, including Situation-Awareness, situation-aware systems in the health care and speech-language domains, Case-Based Reasoning, and Phonological Processes. The motivation for this work is also presented, with regard to promising theories and unexplored research possibilities.

Situation-aware and case-based architecture for speech therapy support

As seen previously, traditional speech therapy presents some obstacles, mainly the lack of specialists in the area and the difficulty of performing adequate patient monitoring. We believe that a situation-aware approach can mitigate these issues; thus, the proposed software architecture aims to integrate aspects of the SA Model (Endsley, 1995) through Perception, Comprehension, and Projection levels. Moreover, the architecture includes the CBR methodology for expanding the

Evaluation and results

In this section, the evaluation details of the architecture and the results obtained in each component module are presented. For this purpose, a prototype that implements the architecture was developed.

Conclusions

In this work, a situation-aware and case-based architecture was proposed to assist Speech-Language Pathologists in tasks involving screening of speech disorders and therapeutic planning. Analyzing related work, we observed that the identification of phonological processes is little addressed; furthermore, up to the submission of this paper, we could not find any work applying Situation-Awareness and Case-Based Reasoning for problem solving in the speech therapy domain. Considering that an early

Declaration of Competing Interest

There are no other relationships, conditions, or circumstances that present a potential conflict of interest.

Acknowledgements

This work has been supported by Fundação de Amparo à Pesquisa do Estado do RS (FAPERGS), grant number 17/2551-0000875-8, Conselho Nacional de Desenvolvimento Científico e Tecnológico CNPq Brasil, grant number 423518/2018-6 and UFSM/FATEC through project 041250-9.07.0025(100548). This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

References (62)

  • A. Aamodt et al.

    Case-based reasoning: foundational issues, methodological variations, and system approaches

    AI Commun.

    (1994)
  • M.U. Ahmed et al.

    Case studies on the clinical applications using case-based reasoning

    Computer Science and Information Systems (FedCSIS), 2012 Federated Conference on

    (2012)
  • M.M. Allen

    Intervention efficacy and intensity for children with speech sound disorder

    J. Speech Lang. Hear. Res.

    (2013)
  • A. Azeta et al.

    A case-based reasoning approach for speech-enabled e-learning system

    2009 2nd International Conference on Adaptive Science & Technology (ICAST)

    (2009)
  • S. Begum et al.

    Case-based reasoning systems in the health sciences: a survey of recent trends and developments

    IEEE Trans. Syst. Man Cybern. Part C

    (2011)
  • D. Bolaños et al.

    FLORA: Fluent oral reading assessment of children’s speech

    ACM Trans. Speech Lang. Process.

    (2011)
  • A.R. Brancalioni et al.

    Phonological disorders treatment effect with a stimulability and segment complexity strata model with speech intervention software (sifala)

    Revista CEFAC

    (2016)
  • L. Breiman

    Random forests

    Mach. Learn.

    (2001)
  • M.I. Ceron et al.

    Factors influencing consonant acquisition in Brazilian Portuguese-speaking children

    J. Speech Lang. Hear. Res.

    (2017)
  • F. Chuchuca-Méndez et al.

    An educative environment based on ontologies and e-learning for training on design of speech-language therapy plans for children with disabilities and communication disorders

    Ciencias de la Informática y Desarrollos de Investigación (CACIDI), IEEE Congreso Argentino de

    (2016)
  • B. Dodd

    Differential Diagnosis and Treatment of Children with Speech Disorder

    (2013)
  • B. Dodd

    Differential diagnosis of pediatric speech sound disorder

    Curr. Dev. Disord. Rep.

    (2014)
  • B. Dodd et al.

    Phonological disorders in children: Changes in phonological process use during treatment

    Int. J. Lang. Commun. Disord.

    (1989)
  • M.R. Endsley

    Toward a theory of situation awareness in dynamic systems

    Hum. Fact.

    (1995)
  • M.R. Endsley et al.

    Theoretical underpinnings of situation awareness: a critical review

    Situation Aware. Anal. Meas.

    (2000)
  • Y. Fan et al.

    A case-based reasoning approach for speech corpus generation

    International Conference on Natural Language Processing

    (2005)
  • M.H. Franciscatto et al.

    Situation awareness in the speech therapy domain: a systematic mapping study

    Comput. Speech Lang.

    (2018)
  • M.H. Franciscatto et al.

    A case-based system architecture based on situation-awareness for speech therapy

    Enterprise Information Systems (ICEIS), 20th International Conference on

    (2018)
  • M.H. Franciscatto et al.

    Blending situation awareness with machine learning to identify children’s speech disorders

    12th International Conference on Application of Information and Communication Technologies, AICT 2018, Almaty, Kazakhstan, October 17-19

    (2018)
  • E. Fringi et al.

    Evidence of phonological processes in automatic recognition of children’s speech

    Sixteenth Annual Conference of the International Speech Communication Association

    (2015)
  • M.R.L. Ghisleni et al.

    O uso das estratégias de reparo, considerando a gravidade do desvio fonológico evolutivo

    Revista CEFAC

    (2010)