Towards a speech therapy support system based on phonological processes early detection
Introduction
Speech acquisition is a complex process in which the production of phonemes is characterized by continuities and discontinuities in the child's path toward mature production of the segments and structures of the ambient language (Rvachew and Bernhardt, 2010). Some error patterns in speech, also known as phonological processes (PPs), are typical in language acquisition and occur when the child tries to adapt his/her speech to adult speech (Ceron et al., 2017). The presence of these error patterns may indicate disordered or deviant development (Dodd, 2013), in which children often experience difficulties with literacy, including decoding, reading comprehension, and the production of written text (Joye et al., 2019). In this context, monitoring PPs in pronunciation is vital for identifying delayed or disordered phonological development in children, which compromises biological, psychological, and social factors of the individual (Martín-Ruiz et al., 2013).
Knowledge of phonological development has great significance in the clinical population to determine whether a child has a phonological impairment and needs intervention (Abou-Elsaad et al., 2019). However, speech therapy faces several barriers, especially in developing countries: the lack of resources, rehabilitation centers, and qualified personnel to provide intervention services; the work overload of professionals; and the poor development or adaptation of assistive technologies to patients' needs (Robles-Bykbaev et al., 2016). To face these difficulties, many studies have proposed computer-based speech therapy systems or virtual speech therapists (VSTs) for people with speech disorders (Chen et al., 2016). The related literature also presents studies focused on developing speech recognition systems (Abad et al., 2013; Grossinho et al., 2016; Caballero-Morales and Trujillo-Romero, 2014; Bolaños et al., 2011) to support SLPs in clinical environments.
Although many studies focus on risk prevention and patient monitoring, few proposals in the speech therapy domain use knowledge modeling to improve tasks such as diagnosis, therapy planning, and therapeutic intervention (Chuchuca-Méndez et al., 2016). Moreover, these proposals are generally not concerned with identifying and inferring error patterns in speech. Identifying PPs in pronunciation may help Speech-Language Pathologists (SLPs) extend traditional speech therapy, identify relevant and recurrent patterns in speech, and predict behaviors according to these patterns. Consequently, it is possible to support these professionals through optimization of therapeutic tasks, planning of therapy, and recommendation of necessary actions.
On these grounds, in this paper we present an architectural model that applies Situation-Awareness (SA) and Case-Based Reasoning (CBR) for automatically identifying PPs in children’s speech. SA has been recognized as an important resource in many different domains, and involves collecting contextual information about the environment, making decisions based on this collection, acting according to decisions, and gathering feedback from the environment for making better decisions in the future (Kokar and Endsley, 2012). CBR can also be a favorable choice in speech and health contexts, since this methodology has good learning capabilities, and its ability to solve problems improves as new cases are stored in history log files or in databases (Husain and Pheng, 2010).
According to a recently published systematic review (Franciscatto et al., 2018a), no proposal in the literature yet applies CBR and SA to automatically identify Phonological Processes (PPs) and suggest them to SLPs. Applied together, these theories may be powerful tools for indicating PPs accurately, since the knowledge base is constantly updated and gains capabilities over time as diagnosed cases are confirmed.
Thus, the architectural model presented in this paper applies these theories in two main modules (Capture Module and Service Module) for automatically identifying PPs in children’s speech. The Capture Module is responsible for SA’s perception phase, through speech data collection (audio recording), while the Service Module is responsible for SA’s comprehension phase, through classification of pronunciation as correct or incorrect. In this module, the four activities of the CBR cycle (retrieval, reuse, review, and retain) are applied for prediction and recommendation of Phonological Processes, completing the SA’s projection level in the approach.
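The four CBR activities named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Case` and `CaseBase` classes, the feature labels such as `cluster_reduced`, the Jaccard-style similarity, and the same-word bonus of 0.5 are all assumptions chosen only to show how retrieval, reuse, review, and retain could fit together around a growing case base.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Case:
    """One stored evaluation (hypothetical structure, for illustration only)."""
    word: str          # target word that was pronounced
    errors: frozenset  # assumed labels describing the observed deviation
    process: str       # phonological process confirmed by the SLP


@dataclass
class CaseBase:
    cases: list = field(default_factory=list)

    def retrieve(self, word, errors):
        """Retrieval: return the most similar stored case, or None if empty."""
        def similarity(case):
            union = case.errors | errors
            overlap = len(case.errors & errors) / len(union) if union else 1.0
            # Assumed heuristic: a small bonus when the target word matches.
            return overlap + (0.5 if case.word == word else 0.0)
        return max(self.cases, key=similarity, default=None)

    def reuse(self, case):
        """Reuse: propose the retrieved case's process as the suggestion."""
        return case.process if case is not None else None

    def review(self, suggested, slp_decision=None):
        """Review: the SLP may confirm (None) or override the suggestion."""
        return suggested if slp_decision is None else slp_decision

    def retain(self, word, errors, process):
        """Retain: store the confirmed case so later retrievals improve."""
        self.cases.append(Case(word, frozenset(errors), process))
```

A session would call `retrieve` and `reuse` to obtain a suggestion, pass it through `review` with the therapist's decision, and then `retain` the confirmed result, which is how the case base gains coverage over time.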
We evaluated our proposal with a Phonological Knowledge Base containing speech samples collected from 1,114 evaluations performed with 84 Portuguese words over the last three years. These target words were chosen to facilitate the identification of PPs, and each word has a score associated with the possible PPs in the acquisition of the Portuguese language. The results showed an average accuracy of 92.5% for classifying pronunciations as correct or incorrect, in agreement with the therapist's evaluation. The pronunciation results were then used to predict the PPs; in this step, our approach achieved an accuracy of over 93%, showing that it is possible to predict PPs for clinical decision support.
The present paper is structured as follows. In Section 2, the main concepts related to this research are presented, as well as the motivation of this work. In Section 3 we report our situation-aware and case-based approach for speech therapy support, while Section 4 shows the evaluation, results and discussion. Lastly, in Section 5 we present the conclusions and final remarks.
Background and motivation
This section presents the main concepts and related work covering knowledge areas involved in this study, including Situation-Awareness, situation-aware systems in health care and speech-language domain, Case-Based Reasoning and Phonological Processes. Also, the motivation for this work is presented, with regard to promising theories and unexplored research possibilities.
Situation-aware and case-based architecture for speech therapy support
As seen previously, traditional speech therapy presents some obstacles, mainly the lack of specialists in the area and the difficulty of performing adequate patient monitoring. We believe that a situation-aware approach can mitigate these issues; thus, the proposed software architecture aims to integrate aspects of the SA Model (Endsley, 1995) through its Perception, Comprehension, and Projection levels. Moreover, the architecture includes the CBR methodology for expanding the
Evaluation and results
In this section, the evaluation details of the architecture and the results obtained in each component module are presented. For this purpose, a prototype that implements the architecture was developed.
Conclusions
In this work, a situation-aware and case-based architecture was proposed to assist Speech-Language Pathologists in tasks involving screening of speech disorders and therapeutic planning. Analyzing related work, we observed that the identification of phonological processes is rarely addressed; furthermore, up to the submission of this paper, we could not find any work applying Situation-Awareness and Case-Based Reasoning for problem solving in the speech therapy domain. Considering that an early
Declaration of Competing Interest
There are no other relationships, conditions, or circumstances that present a potential conflict of interest.
Acknowledgements
This work has been supported by Fundação de Amparo à Pesquisa do Estado do RS (FAPERGS), grant number 17/2551-0000875-8; Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brasil, grant number 423518/2018-6; and UFSM/FATEC through project 041250-9.07.0025(100548). This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
References (62)
- et al. Automatic word naming recognition for an on-line aphasia treatment system. Comput. Speech Lang. (2013)
- et al. Identification of phonological processes in Arabic–speaking Egyptian children by single-word test. J. Commun. Disord. (2019)
- et al. Evolutionary approach for integration of multiple pronunciation patterns for enhancement of dysarthric speech recognition. Expert Syst. Appl. (2014)
- et al. Systematic review of virtual speech therapists for speech disorders. Comput. Speech Lang. (2016)
- Problem solving and reasoning: case-based. International Encyclopedia of the Social & Behavioral Sciences (Second Edition) (2015)
- et al. Cognitive mechanisms underlying reading and spelling development in five European orthographies. Learn. Instruct. (2014)
- et al. Computerized assessment of phonological processes in Malayalam (CAPP-M). Global Humanitarian Technology Conference: South Asia Satellite (GHTC-SAS), 2013 IEEE (2013)
- et al. Processes and intelligibility in disordered phonology. Clin. Linguist. Phonet. (1988)
- et al. ML-KNN: a lazy learning approach to multi-label learning. Pattern Recognit. (2007)
- Solving large scale linear prediction problems using stochastic gradient descent algorithms. ICML 2004: Proceedings of the 21st International Conference on Machine Learning. Omnipress (2004)
- Case-based reasoning: foundational issues, methodological variations, and system approaches. AI Commun.
- Case studies on the clinical applications using case-based reasoning. Computer Science and Information Systems (FedCSIS), 2012 Federated Conference on
- Intervention efficacy and intensity for children with speech sound disorder. J. Speech Lang. Hear. Res.
- A case-based reasoning approach for speech-enabled e-learning system. 2009 2nd International Conference on Adaptive Science & Technology (ICAST)
- Case-based reasoning systems in the health sciences: a survey of recent trends and developments. IEEE Trans. Syst. Man Cybern. Part C
- Flora: Fluent oral reading assessment of children’s speech. ACM Trans. Speech Lang. Process.
- Phonological disorders treatment effect with a stimulability and segment complexity strata model with speech intervention software (SIFALA). Revista CEFAC
- Random forests. Mach. Learn.
- Factors influencing consonant acquisition in Brazilian Portuguese–speaking children. J. Speech Lang. Hear. Res.
- An educative environment based on ontologies and e-learning for training on design of speech-language therapy plans for children with disabilities and communication disorders. Ciencias de la Informática y Desarrollos de Investigación (CACIDI), IEEE Congreso Argentino de
- Differential Diagnosis and Treatment of Children with Speech Disorder
- Differential diagnosis of pediatric speech sound disorder. Curr. Dev. Disord. Rep.
- Phonological disorders in children: Changes in phonological process use during treatment. Int. J. Lang. Commun. Disord.
- Toward a theory of situation awareness in dynamic systems. Hum. Fact.
- Theoretical underpinnings of situation awareness: a critical review. Situation Aware. Anal. Meas.
- A case-based reasoning approach for speech corpus generation. International Conference on Natural Language Processing
- Situation awareness in the speech therapy domain: a systematic mapping study. Comput. Speech Lang.
- A case-based system architecture based on situation-awareness for speech therapy. Enterprise Information Systems (ICEIS), 20th International Conference on
- Blending situation awareness with machine learning to identify children’s speech disorders. 12th International Conference on Application of Information and Communication Technologies, AICT 2018, Almaty, Kazakhstan, October 17-19
- Evidence of phonological processes in automatic recognition of children’s speech. Sixteenth Annual Conference of the International Speech Communication Association
- O uso das estratégias de reparo, considerando a gravidade do desvio fonológico evolutivo. Revista CEFAC