A convolutional neural-network model of human cochlear mechanics and filter tuning for real-time applications, Nature Machine Intelligence


Nature Machine Intelligence (IF 23.8), Pub Date: 2021-02-08, DOI: 10.1038/s42256-020-00286-8
Deepak Baby, Arthur Van Den Broucke, Sarah Verhulst


Auditory models are commonly used as feature extractors for automatic speech-recognition systems or as front-ends for robotics, machine-hearing and hearing-aid applications. Although auditory models can capture the biophysical and nonlinear properties of human hearing in great detail, these biophysical models are computationally expensive and cannot be used in real-time applications. We present a hybrid approach where convolutional neural networks are combined with computational neuroscience to yield a real-time end-to-end model for human cochlear mechanics, including level-dependent filter tuning (CoNNear). The CoNNear model was trained on acoustic speech material and its performance and applicability were evaluated using (unseen) sound stimuli commonly employed in cochlear mechanics research. The CoNNear model accurately simulates human cochlear frequency selectivity and its dependence on sound intensity, an essential quality for robust speech intelligibility at negative speech-to-background-noise ratios. The CoNNear architecture is based on parallel and differentiable computations and has the power to achieve real-time human performance. These unique CoNNear features will enable the next generation of human-like machine-hearing applications.
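As a rough illustration of what "parallel and differentiable computations" can mean for a convolutional filterbank, the sketch below maps a waveform to several output channels using independent 1-D convolutions followed by a smooth nonlinearity. It is not the CoNNear architecture, whose layers and parameters are not given in this abstract. The kernels, the sampling rate, the channel frequencies and the names `conv1d`, `tone_kernel` and `toy_filterbank` are all assumptions made for illustration.

```python
import math


def conv1d(signal, kernel):
    """Same-length 1-D cross-correlation with zero padding (odd-length kernel)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def tone_kernel(freq_hz, fs_hz, length=33):
    """Hypothetical fixed kernel: a Hann-windowed cosine centred on freq_hz.

    A trained network would learn its kernels; these are hand-made stand-ins.
    """
    mid = length // 2
    return [
        math.cos(2 * math.pi * freq_hz * (n - mid) / fs_hz)
        * (0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1)))
        / length
        for n in range(length)
    ]


def toy_filterbank(signal, kernels):
    """Apply each kernel independently, then a tanh nonlinearity.

    Every channel depends only on the input waveform, so the channels could
    be evaluated in parallel, and each operation is differentiable.
    """
    return [[math.tanh(y) for y in conv1d(signal, k)] for k in kernels]


fs = 16000  # assumed sampling rate in Hz
kernels = [tone_kernel(f, fs) for f in (500.0, 1000.0, 2000.0)]
tone = [math.sin(2 * math.pi * 1000.0 * n / fs) for n in range(400)]
outputs = toy_filterbank(tone, kernels)  # one output list per channel
```

Here `outputs` holds one time series per channel for a 1 kHz test tone. A biophysical model would instead solve coupled equations along the cochlea, which is the expensive step the abstract says the neural network replaces.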


Updated: 2021-02-08
