Feature trajectory dynamic time warping for clustering of speech segments, EURASIP Journal on Audio, Speech, and Music Processing


EURASIP Journal on Audio, Speech, and Music Processing (IF 2.4), Pub Date: 2019-04-04, DOI: 10.1186/s13636-019-0149-9
Lerato Lerato, Thomas Niesler

Dynamic time warping (DTW) can be used to compute the similarity between two sequences of generally differing length. We propose a modification to DTW that performs individual and independent pairwise alignment of feature trajectories. The modified technique, termed feature trajectory dynamic time warping (FTDTW), is applied as a similarity measure in the agglomerative hierarchical clustering of speech segments. Experiments using MFCC and PLP parametrisations extracted from TIMIT and from the Spoken Arabic Digit Dataset (SADD) show consistent and statistically significant improvements in the quality of the resulting clusters in terms of F-measure and normalised mutual information (NMI).
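To make the contrast concrete, the following is a minimal sketch of classical DTW next to a per-trajectory variant in the spirit of FTDTW. It is not the authors' implementation. The function names (`dtw_cost`, `dtw_distance`, `ftdtw_distance`), the Euclidean and absolute-difference local costs, the unconstrained step pattern, and in particular the choice to sum the per-trajectory costs without normalisation are all assumptions made for illustration; the abstract does not specify these details.

```python
from math import inf


def dtw_cost(a, b, dist):
    """Classical DTW: minimum cumulative cost of aligning sequences a and b."""
    n, m = len(a), len(b)
    # acc[i][j] holds the best cost of aligning a[:i] with b[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
            acc[i][j] = dist(a[i - 1], b[j - 1]) + best_prev
    return acc[n][m]


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal dimension."""
    return sum((p - q) ** 2 for p, q in zip(u, v)) ** 0.5


def dtw_distance(x, y):
    """Baseline: a single warping path shared by all feature dimensions."""
    return dtw_cost(x, y, euclidean)


def ftdtw_distance(x, y):
    """Per-trajectory variant: each feature dimension is aligned on its own.

    Assumption: the per-dimension costs are simply summed.
    """
    total = 0.0
    for d in range(len(x[0])):
        traj_x = [frame[d] for frame in x]  # trajectory of feature d in x
        traj_y = [frame[d] for frame in y]  # trajectory of feature d in y
        total += dtw_cost(traj_x, traj_y, lambda p, q: abs(p - q))
    return total


# Two 3-frame segments whose two features change at different times.
x = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
y = [[0.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(dtw_distance(x, y))    # shared path cannot match both features at once
print(ftdtw_distance(x, y))  # each trajectory warps independently
```

In this toy example each feature trajectory of `x` is a time-warped copy of the corresponding trajectory of `y`, so the per-trajectory cost is zero, while the shared-path cost is not. Either function could supply the pairwise distances for an agglomerative hierarchical clustering routine.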


Updated: 2019-04-04
