Listeners track talker-specific prosody to deal with talker-variability, Brain Research

Brain Research (IF 2.7), Pub Date: 2021-08-05, DOI: 10.1016/j.brainres.2021.147605
Giulio G A Severijnen, Hans Rutger Bosker, Vitória Piai, James M McQueen

One of the challenges in speech perception is that listeners must deal with considerable segmental and suprasegmental variability in the acoustic signal due to differences between talkers. Most previous studies have focused on how listeners deal with segmental variability. In this EEG experiment, we investigated whether listeners track talker-specific usage of suprasegmental cues to lexical stress to recognize spoken words correctly. In a three-day training phase, Dutch participants learned to map non-word minimal stress pairs onto different object referents (e.g., USklot meant “lamp”; usKLOT meant “train”). These non-words were produced by two male talkers. Critically, each talker used only one suprasegmental cue to signal stress (e.g., Talker A used only F0 and Talker B only intensity). We expected participants to learn which talker used which cue to signal stress. In the test phase, participants indicated whether spoken sentences including these non-words were correct (“The word for lamp is…”). We found that participants were slower to indicate that a stimulus was correct if the non-word was produced with the unexpected cue (e.g., Talker A using intensity). That is, if in training Talker A used F0 to signal stress, participants experienced a mismatch between predicted and perceived phonological word-forms if, at test, Talker A unexpectedly used intensity to cue stress. In contrast, the N200 amplitude, an event-related potential related to phonological prediction, was not modulated by the cue mismatch. Theoretical implications of these contrasting results are discussed. The behavioral findings illustrate talker-specific prediction of prosodic cues, picked up through perceptual learning during training.

Updated: 2021-08-19
