The Automatic Detection of Speech Disorders in Children: Challenges, Opportunities and Preliminary Results
IEEE Journal of Selected Topics in Signal Processing (IF 7.5), Pub Date: 2020-02-01, DOI: 10.1109/jstsp.2019.2959393
Mostafa Shahin, Usman Zafar, Beena Ahmed

Because children in need often have limited access to Speech and Language Pathologists (SLPs), pediatric Computer-Aided Speech Therapy (CAST) tools can play an important role in the early diagnosis and treatment of speech disorders. However, several challenges impede accurate automated analysis of speech disorders in children. In this article, we first discuss three key challenges in processing child disordered speech: 1) the unreliability of low-level annotation and the scarcity of speech corpora, 2) speaker diarization of therapy sessions, and 3) inaccurate children's acoustic models. We next explore opportunities to overcome some of these challenges. First, we investigate the effectiveness of high-level paralinguistic features in disordered speech detection to reduce the dependency on annotated data. A binary classifier trained using paralinguistic features extracted from both typically developing children and children with Speech Sound Disorders (SSD) achieved 87% subject-level classification accuracy. Second, we treat speech disorder detection as an anomaly detection problem in which models are trained only on typically developing speech, reducing the need for disordered training data. An anomaly detection-based system trained on speech attribute features to distinguish between typical and atypical phoneme pronunciations of children with speech disorders obtained a phoneme-level F1 score of 0.77. Finally, we test the effectiveness of an x-vector based speaker diarization technique in pediatric therapy sessions. The method successfully distinguished between therapist and child speech with a Diarization Error Rate (DER) of 10%.
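
The anomaly detection framing in the abstract can be illustrated with a minimal sketch. This is not the authors' system, which uses speech attribute features that the abstract does not detail. The toy two-dimensional feature vectors, the per-dimension Gaussian model, the `fit_typical` and `anomaly_score` helpers, and the threshold rule (1.5 times the largest training score) are all assumptions chosen for illustration. The sketch only shows the general idea: fit a model on typical samples alone, flag anything that scores far from it as atypical, and evaluate the result with an F1 score.

```python
from statistics import mean, pstdev


def fit_typical(features):
    """Estimate per-dimension mean and std from typical-only feature vectors."""
    dims = list(zip(*features))
    mus = [mean(d) for d in dims]
    sigmas = [pstdev(d) or 1.0 for d in dims]  # guard against zero variance
    return mus, sigmas


def anomaly_score(x, model):
    """Mean squared z-score of x under the typical-speech model."""
    mus, sigmas = model
    return mean(((v - m) / s) ** 2 for v, m, s in zip(x, mus, sigmas))


def f1_score(y_true, y_pred):
    """F1 for the positive (atypical) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: training uses typical samples only.
typical_train = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.0, 2.0]]
model = fit_typical(typical_train)

# Assumed threshold rule: a margin above the worst training score.
threshold = 1.5 * max(anomaly_score(x, model) for x in typical_train)

# Hypothetical test samples with ground truth (True = atypical).
test_samples = [[1.1, 1.9], [2.0, 0.5], [1.0, 3.0], [1.3, 2.0]]
y_true = [False, True, True, True]

predictions = [anomaly_score(x, model) > threshold for x in test_samples]
f1 = f1_score(y_true, predictions)
print(predictions, round(f1, 2))
```

In this toy run the last sample is mildly atypical and falls under the threshold, so it is missed. That is the trade-off the threshold controls: raising it reduces false alarms on typical speech at the cost of recall on atypical speech.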

Last updated: 2020-02-01