Phonetic accommodation to natural and synthetic voices: Behavior of groups and individuals in speech shadowing
Speech Communication (IF 3.2), Pub Date: 2020-12-29, DOI: 10.1016/j.specom.2020.12.004
Iona Gessinger, Eran Raveh, Ingmar Steiner, Bernd Möbius

The present study investigates whether native speakers of German phonetically accommodate to natural and synthetic voices in a shadowing experiment. We aim to determine whether this phenomenon, which is frequently found in human-human interaction (HHI), also occurs in human-computer interaction (HCI) involving synthetic speech. The examined features pertain to different phonetic domains: allophonic variation, schwa epenthesis, realization of pitch accents, word-based temporal structure, and distribution of spectral energy. On the individual level, we found that the participants converged to varying subsets of the examined features, while in other cases they maintained their baseline behavior or, in rare instances, even diverged from the model voices. This shows that accommodation with respect to one particular feature may not predict the behavior with respect to another feature. On the group level, the participants in the natural condition converged to all features under examination, albeit very subtly for schwa epenthesis. The synthetic voices, while partly reducing the strength of the effects found for the natural voices, triggered accommodating behavior as well. The predominant pattern for all voice types was convergence during the interaction followed by divergence after the interaction.




Updated: 2021-01-06