Chinese Tone Recognition Based on 3D Dynamic Muscle Information
Discrete Dynamics in Nature and Society (IF 1.4), Pub Date: 2020-05-31, DOI: 10.1155/2020/5476896
JianRong Wang 1, 2 , Li Wan 1 , Ju Zhang 1 , Qiang Fang 3 , Fan Yang 1 , Jing Hu 1

To advance lip-reading research in line with Chinese pronunciation norms, we investigated Mandarin tone recognition from visual information, in contrast to previous character-based Chinese lip-reading techniques. We focused on the tonal changes of vowels in Chinese pronunciation and designed a lightweight skipping convolution network framework (SCNet). Experimental results showed that SCNet was more sensitive to fine-grained descriptions of pitch change than the traditional model, and that it achieved better tone recognition and strong anti-interference performance. We also studied in more detail how deep texture information assists lip-reading recognition. We found that deep texture information has a significant effect on tone recognition, which confirms the feasibility of multimodal lip reading for Chinese tone recognition. We further verified SCNet on syllable tone recognition: the vowel and syllable tone recognition accuracy of our model reached as high as 97.3%, which indicates the robustness of the proposed method and suggests that it can be widely used for tone recognition.

Updated: 2020-05-31