Pitch and noise normalized acoustic feature for children's ASR
Digital Signal Processing (IF 2.9), Pub Date: 2020-11-27, DOI: 10.1016/j.dsp.2020.102922
Ishwar Chandra Yadav, Gayadhar Pradhan

In this work, we analyze the pitch robustness of the recently reported power normalized cepstral coefficient (PNCC) feature for noise-robust children's speech recognition. The PNCC feature is designed to suppress common types of additive noise present in speech data, but it is found to be susceptible to pitch variations. To normalize the effect of pitch on PNCC, a pitch-based spectral normalization step is incorporated into the PNCC feature extraction process. With this pitch normalization stage, the pitch robustness of PNCC is enhanced without loss of noise suppression capability. The efficacy of the proposed pitch-normalized PNCC (PN-PNCC) feature is investigated using acoustic models based on recurrent structures, such as gated recurrent units (GRU) and light gated recurrent units (LiGRU). The PN-PNCC feature yields a performance improvement over PNCC for both adults' and children's speech recognition under clean as well as noisy test conditions.




Updated: 2020-12-02