Neural speech restoration at the cocktail party: Auditory cortex recovers masked speech of both attended and ignored speakers, PLOS Biology



PLOS Biology (IF 9.8), Pub Date: 2020-10-22, DOI: 10.1371/journal.pbio.3000883
Christian Brodbeck, Alex Jiao, L. Elliot Hong, Jonathan Z. Simon

Humans are remarkably skilled at listening to one speaker out of an acoustic mixture of several speech sources. Two speakers are easily segregated, even without binaural cues, but the neural mechanisms underlying this ability are not well understood. One possibility is that early cortical processing performs a spectrotemporal decomposition of the acoustic mixture, allowing the attended speech to be reconstructed via optimally weighted recombinations that discount spectrotemporal regions where sources heavily overlap. Using human magnetoencephalography (MEG) responses to a 2-talker mixture, we show evidence for an alternative possibility, in which early, active segregation occurs even for strongly spectrotemporally overlapping regions. Early (approximately 70-millisecond) responses to nonoverlapping spectrotemporal features are seen for both talkers. When competing talkers’ spectrotemporal features mask each other, the individual representations persist, but they occur with an approximately 20-millisecond delay. This suggests that the auditory cortex recovers acoustic features that are masked in the mixture, even if they occurred in the ignored speech. The existence of such noise-robust cortical representations, of features present in attended as well as ignored speech, suggests an active cortical stream segregation process, which could explain a range of behavioral effects of ignored background speech.
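To make the idea of "spectrotemporal regions where sources heavily overlap" concrete, the following is a minimal sketch that labels each time-frequency bin of two separately known sources as overlapping or not, and as masked or not. It is not the authors' analysis pipeline, which the abstract does not describe. The function names, the plain short-time Fourier transform (standing in for an auditory spectrogram model), the frame sizes, the -40 dB activity floor, and the synthetic tones are all illustrative assumptions.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude spectrogram (frequency x time) from a Hann-windowed STFT.

    Hypothetical helper; frame_len and hop are arbitrary illustrative values.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def masking_maps(spec_a, spec_b, floor_db=-40.0):
    """Label time-frequency bins of two sources (assumed definition).

    A bin is 'active' for a source if its magnitude exceeds a floor set
    relative to the loudest bin of either source. A bin 'overlaps' if both
    sources are active there, and a source is 'masked' in an overlapping
    bin when the other source has the larger magnitude.
    """
    reference = max(spec_a.max(), spec_b.max())
    floor = reference * 10 ** (floor_db / 20)
    active_a = spec_a > floor
    active_b = spec_b > floor
    overlap = active_a & active_b
    a_masked = overlap & (spec_b > spec_a)
    b_masked = overlap & (spec_a > spec_b)
    return overlap, a_masked, b_masked

# Synthetic stand-ins for two talkers (assumption: pure tones, 8 kHz sampling).
fs = 8000
t = np.arange(fs) / fs
talker_a = np.sin(2 * np.pi * 500 * t)
talker_b = 0.5 * np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 2000 * t)

spec_a = stft_magnitude(talker_a)
spec_b = stft_magnitude(talker_b)
overlap, a_masked, b_masked = masking_maps(spec_a, spec_b)

# Fraction of talker B's active bins that are masked by talker A.
floor = max(spec_a.max(), spec_b.max()) * 10 ** (-40.0 / 20)
b_masked_fraction = b_masked.sum() / (spec_b > floor).sum()
```

In this toy setup the 500 Hz component is shared, so those bins count as overlapping and talker B is masked there, while the 2000 Hz component belongs to talker B alone. The abstract's claim is about what happens in the brain for the masked category: representations of such features persist for both talkers, with an additional delay of roughly 20 milliseconds.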


Updated: 2020-10-30
