The Variability of Vowels' Formants in Forensic Speech, IEEE Instrumentation & Measurement Magazine


The Variability of Vowels' Formants in Forensic Speech
IEEE Instrumentation & Measurement Magazine (IF 1.6), Pub Date: 2021-02-02, DOI: 10.1109/mim.2021.9345600
Sonia Cenceschi, Chiara Meluzzi, Alessandro Trivilini


Speech analysis plays a pivotal role in the use of forensic recordings to resolve a wide range of questions. Although the topic spans a large set of methodologies based on varied digital features (MFCC, Centroid, Harmonicity, VOT, etc., cf. [1]), this work approaches it from a phonetic perspective, considering vowel formants and focusing on their variability and the difficulty of measuring them. Formants correspond to the resonant frequencies of the vocal tract and are therefore sensitive to speaker-related variables such as age and sex. In this respect, formant variation contributes to characterizing a person's subjective timbre. For this reason, formant values are widely used in forensics [2], with all the practical problems that come with them, particularly in speaker recognition or discrimination [3], [4]. Formant values correspond to specific frequencies of the sound signal and are usually reported in Hertz. They are, however, affected by numerous internal and external variables, so that although on average they are characteristic of the individual speaker, they always vary within frequency bands that cannot be defined in absolute terms [5]. How, then, is it possible to provide reliable answers to forensic questions? In essence, it is up to the specialist to determine whether the conditions for carrying out an analysis are met, and to understand, for example, whether the differences between formant values can be ascribed to two different speakers or are too subtle to justify this claim. What should be measurable today, and must be defined at the level of jurisprudence, is therefore the professionalism of the expert. However, this still varies across countries, although several codes of practice, such as that of the International Association for Forensic Phonetics or those of engineering societies, have been validated.


Updated: 2021-02-05
