Computer-assisted assessment of phonetic fluency in a second language: a longitudinal study of Japanese learners of French
Speech Communication (IF 3.2), Pub Date: 2020-10-08, DOI: 10.1016/j.specom.2020.10.001
Sylvain Detey, Lionel Fontan, Maxime Le Coz, Saïd Jmel

Automatic second language (L2) speech fluency assessment has been one of the ultimate goals of several projects aiming at designing Computer-Assisted Pronunciation Training (CAPT) tools for L2 learners. Usually, three challenges must be tackled in order to solve the issues at stake: 1) defining fluency from a threefold interdisciplinary perspective (acoustic and perceptual phonetics, computer science, L2 education); 2) using a cost-effective algorithm; 3) testing the procedure with actual learners’ data. Despite rapid technical developments in the field of automatic speech processing, the tools that are actually available to learners are still scarce, and most of them rely on automatic speech recognition (ASR). Moreover, most research on the topic focuses on English as the target L2. Therefore, in this article, we address the following research questions: (a) is it possible to use a non-ASR-based low-level signal segmentation algorithm to predict human expert assessment of phonetic fluency in beginner Japanese learners of French in a text-reading task during the first stages of their learning? (b) if the answer to (a) is positive, then what are the best predictors of phonetic fluency among a set of available measures (see below for more details)? (c) is it possible to use this algorithm to monitor the evolution of phonetic fluency (and of its associated predictors) in these learners in a longitudinal study? As a first step, a corpus of French sentences read aloud by 12 Japanese learners of different proficiency levels in French was used to design a prediction system. The read-aloud speech data was perceptually annotated by three human experts on four dimensions: overall speech fluency, speech rate, regularity of speech rate, and speech fluidity (i.e. smoothness of transitions between phones). Inter-rater agreement and reliability were high for all dimensions, and the average human ratings were compared with the scores provided by our prediction system.
The results show strong correlations between human and automatic scores of speech rate and regularity of speech rate, and a weak correlation for speech fluidity. Automatic scores were finally combined through a multiple linear regression model in order to predict overall speech fluency. The best model led to a correlation coefficient of .92 between automatic and human ratings, with a root-mean-square error of .38. In the second step of this study, a corpus of identical sentences read aloud four times over two years by 12 Japanese learners of French (after 4, 7, 12, and 19 months of French courses in Japan) was fed to the automatic system. The results show regular progress in overall speech fluency, which fits with the regular progress the Japanese learners under scrutiny were expected to make every semester through their academic program in French at their university in Japan. Our study suggests a positive answer to our first and third research questions, with speech rate as the best predictor to answer our second research question. From a pedagogical perspective, it seems that such a simple algorithm could be integrated into a CAPT tool to monitor learners’ progress in phonetic fluency in reading-aloud tasks.
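
The evaluation described in the abstract (combining automatic scores through multiple linear regression, then reporting a correlation coefficient and a root-mean-square error against human ratings) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the features' exact definitions, the rating scale, or the fitting procedure, so the function names, the 1-to-5 scale, and every number below are invented purely for illustration.

```python
import math

def fit_linear(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.
    An intercept column is prepended to every feature row."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for i in range(k):
            if i != c:
                f = A[i][c]
                A[i] = [vi - f * vc for vi, vc in zip(A[i], A[c])]
    return [A[i][k] for i in range(k)]  # [intercept, b1, b2, ...]

def predict(coef, X):
    """Apply the fitted intercept and weights to each feature row."""
    return [coef[0] + sum(b * x for b, x in zip(coef[1:], r)) for r in X]

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def rmse(a, b):
    """Root-mean-square error between two equal-length lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# HYPOTHETICAL toy data, not taken from the study.
# Columns: automatic scores for speech rate, regularity of rate, fluidity.
auto_scores = [
    [2.0, 3.0, 2.5],
    [3.5, 3.0, 3.0],
    [4.0, 4.5, 3.5],
    [1.5, 2.0, 3.0],
    [5.0, 4.0, 4.5],
    [3.0, 2.5, 2.0],
]
# HYPOTHETICAL averaged human ratings of overall fluency (assumed 1-5 scale).
human_overall = [2.3, 3.4, 4.1, 1.8, 4.7, 2.9]

coef = fit_linear(auto_scores, human_overall)
predicted = predict(coef, auto_scores)
print("coefficients:", [round(c, 3) for c in coef])
print("r    =", round(pearson_r(predicted, human_overall), 3))
print("RMSE =", round(rmse(predicted, human_overall), 3))
```

With only six invented data points and four fitted parameters, the in-sample fit here is optimistic by construction; a real evaluation would use held-out speakers or cross-validation, which the sketch leaves out for brevity.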

Updated: 2020-10-13