New Methods for Prosodic Transcription: Capturing Variability as a Source of Information
Laboratory Phonology (IF 1.761), Pub Date: 2016-06-30, DOI: 10.5334/labphon.29
Jennifer Cole, Stefanie Shattuck-Hufnagel

Understanding the role of prosody in encoding linguistic meaning and in shaping phonetic form requires the analysis of prosodically annotated speech drawn from a wide variety of speech materials. Yet obtaining accurate and reliable prosodic annotations for even small datasets is challenging due to the time and expertise required. We discuss several factors that make prosodic annotation difficult and impact its reliability, all of which relate to variability: in the patterning of prosodic elements (features and structures) as they relate to the linguistic and discourse context, in the acoustic cues for those prosodic elements, and in the parameter values of the cues. We propose two novel methods for prosodic transcription that capture variability as a source of information relevant to the linguistic analysis of prosody. The first is Rapid Prosody Transcription (RPT), which can be performed by non-experts using a simple set of unary labels to mark prominence and boundaries based on immediate auditory impression. Inter-transcriber variability is used to calculate continuous-valued prosody ‘scores’ that are assigned to each word and represent the perceptual salience of its prosodic features or structure. RPT can be used to model the relative influence of top-down factors and acoustic cues in prosody perception, and to model prosodic variation across many dimensions, including language variety, speech style, or speaker’s affect. The second proposed method is the identification of individual cues to the contrastive prosodic elements of an utterance. Cue specification provides a link between the contrastive symbolic categories of prosodic structures and the continuous-valued parameters in the acoustic signal, and offers a framework for investigating how factors related to the grammatical and situational context influence the phonetic form of spoken words and phrases. While cue specification as a transcription tool has not yet been explored as RPT has, it has the potential to provide a level of detail that will be useful in modelling systematic context-governed variation in the implementation of prosodic categories, with applications in automatic speech synthesis and recognition, as well as modelling human speech production and perception. We discuss how RPT and cue specification, particularly when combined, can improve the efficiency and reliability of prosodic transcription and how they can be integrated with expert phonological transcription.
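
The abstract does not spell out how the per-word prosody scores are computed from inter-transcriber variability. As a minimal sketch, assuming each score is simply the proportion of transcribers who marked a given word (an assumption made here for illustration, not a formula stated in the abstract), the calculation could look like the following. The function name `prosody_scores`, the example words, and the marks are all hypothetical.

```python
from typing import List, Sequence


def prosody_scores(words: Sequence[str],
                   marks: Sequence[Sequence[int]]) -> List[float]:
    """Return one score per word: the fraction of transcribers who marked it.

    `marks` holds one row per transcriber and one unary label per word
    (1 = marked as prominent or as preceding a boundary, 0 = unmarked).
    ASSUMPTION: score = proportion of transcribers marking the word.
    """
    if not marks:
        raise ValueError("at least one transcriber is required")
    n_words = len(words)
    for row in marks:
        if len(row) != n_words:
            raise ValueError("every transcriber must label every word")
    n_transcribers = len(marks)
    return [sum(row[i] for row in marks) / n_transcribers
            for i in range(n_words)]


# Hypothetical data: four transcribers marking prominence on four words.
words = ["the", "cat", "sat", "down"]
prominence_marks = [
    [0, 1, 0, 1],
    [0, 1, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 0, 1],
]

scores = prosody_scores(words, prominence_marks)
for word, score in zip(words, scores):
    print(f"{word}: {score:.2f}")
```

Under this assumption, a word that every transcriber marks scores 1.0, a word that none marks scores 0.0, and disagreement among transcribers yields intermediate values, which is how variability becomes a graded measure rather than noise. The same function would apply unchanged to boundary marks.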

Updated: 2016-06-30