Vocal tract shaping of emotional speech.
Computer Speech & Language (IF 3.1), Pub Date: 2020-04-16, DOI: 10.1016/j.csl.2020.101100
Jangwon Kim, Asterios Toutios, Sungbok Lee, Shrikanth S. Narayanan

Emotional speech production has previously been studied using fleshpoint tracking data in speaker-specific experimental setups. The present study introduces a real-time magnetic resonance imaging database of emotional speech production from 10 speakers and presents articulatory analyses of emotional expression in speech based on that database. Midsagittal vocal tract parameters (midsagittal distances and vocal tract length) were extracted on a two-dimensional grid-line system using image segmentation software. Principal feature analysis was applied to the grid-line system to find the locations of major movement. The results reveal both speaker-dependent and speaker-independent variation patterns. For example, sad speech, a low-arousal emotion, tends to show a smaller opening for low vowels than the high-arousal emotions do, and it does so more consistently in the front cavity than in other regions of the vocal tract. Happiness shows a significantly shorter vocal tract length than anger and sadness in most speakers. Further details of speaker-dependent and speaker-independent articulatory variation in emotional expression, and their implications, are described.



Updated: 2020-04-16