LDL-AURIS: a computational model, grounded in error-driven learning, for the comprehension of single spoken words
Language, Cognition and Neuroscience (IF 2.3), Pub Date: 2021-07-21, DOI: 10.1080/23273798.2021.1954207
Elnaz Shafaei-Bajestan 1 , Masoumeh Moradipour-Tari 1 , Peter Uhrig 2 , R. Harald Baayen 1

ABSTRACT

A computational model for the comprehension of single spoken words is presented that builds on an earlier model using discriminative learning. Real-valued features are extracted from the speech signal instead of discrete features. Vectors representing word meanings using one-hot encoding are replaced by real-valued semantic vectors. Instead of incremental learning with Rescorla-Wagner updating, we use linear discriminative learning, which captures incremental learning at the limit of experience. These new design features substantially improve prediction accuracy for unseen words, and provide enhanced temporal granularity, enabling the modelling of cohort-like effects. Visualisation with t-SNE shows that the acoustic form space captures phone-like properties. Trained on 9 h of audio from a broadcast news corpus, the model achieves recognition performance that approximates the lower bound of human accuracy in isolated word recognition tasks. LDL-AURIS thus provides a mathematically-simple yet powerful characterisation of the comprehension of single words as found in English spontaneous speech.
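The contrast the abstract draws between incremental updating and its "limit of experience" can be illustrated with a small sketch. This is not the authors' implementation: the matrix names `C` (real-valued form vectors), `S` (real-valued semantic vectors) and `F` (the linear mapping), the toy dimensions, the learning rate, and the random noiseless data are all assumptions made for illustration. The incremental learner here uses the Widrow-Hoff delta rule, the real-valued counterpart of Rescorla-Wagner updating, and the closed-form estimate is an ordinary least-squares solution.

```python
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def fit_closed_form(C, S):
    """Least-squares mapping F from the normal equations (C^T C) F = C^T S."""
    Ct = transpose(C)
    return solve(matmul(Ct, C), matmul(Ct, S))

def fit_incremental(C, S, rate=0.01, epochs=2000):
    """Delta-rule (Widrow-Hoff) updates, one learning event at a time."""
    n_in, n_out = len(C[0]), len(S[0])
    F = [[0.0] * n_out for _ in range(n_in)]
    for _ in range(epochs):
        for c, s in zip(C, S):
            pred = [sum(c[i] * F[i][j] for i in range(n_in)) for j in range(n_out)]
            for i in range(n_in):
                for j in range(n_out):
                    F[i][j] += rate * c[i] * (s[j] - pred[j])
    return F

# Hypothetical toy data: 20 "words", 4 form dimensions, 3 semantic dimensions.
random.seed(0)
n_words, n_form, n_sem = 20, 4, 3
C = [[random.gauss(0, 1) for _ in range(n_form)] for _ in range(n_words)]
F_true = [[random.gauss(0, 1) for _ in range(n_sem)] for _ in range(n_form)]
S = matmul(C, F_true)  # noiseless targets, so an exact linear mapping exists

F_ls = fit_closed_form(C, S)
F_inc = fit_incremental(C, S)

# Largest element-wise difference between the two estimated mappings.
max_gap = max(abs(a - b) for ra, rb in zip(F_ls, F_inc) for a, b in zip(ra, rb))
print("incremental and closed-form mappings agree closely:", max_gap < 1e-6)
```

On this noiseless toy problem the repeated delta-rule passes approach the same mapping that the closed-form solution gives in one step, which is the sense in which a linear estimate can stand in for incremental learning after extensive experience. With noisy data and a constant learning rate, the incremental estimate would only hover near the least-squares solution.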


Updated: 2021-07-21