The evolution of drum modes with strike intensity: Analysis and synthesis using the discrete cosine transform
The Journal of the Acoustical Society of America (IF 2.1), Pub Date: 2021-07-12, DOI: 10.1121/10.0005509
Tim Kirby, Mark Sandler

The synthesis of convincing acoustic drum sounds remains an open problem. In this paper, a method for analysing and synthesising pitch glide in drums is proposed, whereby the discrete cosine transform (DCT) of an unwindowed drum sound is modelled. This is an extension of the scheme initially proposed by Kirby and Sandler [(2020). Proceedings of the 23rd International Conference on Digital Audio Effects, Vienna, Austria, pp. 155–162], which was able to reproduce key components of drum sounds accurately enough that they could not be distinguished from the reference samples. Here, drum modes were analysed in greater detail for a tom-tom struck at 67 different intensities to investigate their evolution with strike velocity. A clear evolution was observed in the DCT features, and interpolation was used to synthesise the modes of intermediate velocity. These synthesised modes were evaluated objectively through null testing, which showed that a continuous blending of strike velocities could be achieved throughout the data set. An AB listening test was also performed, where 20 participants attempted to distinguish between pairs of real and synthesised sounds. Exactly 50% accuracy was achieved overall, which demonstrates that the synthesised samples were deemed to sound as realistic as genuine samples. These results demonstrate that the DCT representation is a valuable framework for analysis and synthesis of drum sounds. It is also likely that this approach could be applied to other instruments.
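As a rough illustration of two ideas named in the abstract (taking the DCT of an unwindowed sound, and scoring a resynthesis with a null test), the following sketch uses a toy decaying sinusoid with a small pitch glide in place of a recorded tom-tom. It is not the authors' implementation: the helper names `toy_mode` and `null_test_db`, the signal parameters, and the 60 to 200 Hz band are all assumptions made for this example, and the paper's feature modelling and velocity interpolation are not reproduced here.

```python
import numpy as np
from scipy.fft import dct, idct


def toy_mode(sr, duration, f_end, glide_hz, decay):
    """Hypothetical stand-in for one drum mode: a decaying sinusoid whose
    frequency starts glide_hz above f_end and settles towards f_end."""
    t = np.arange(int(sr * duration)) / sr
    freq = f_end + glide_hz * np.exp(-20.0 * t)   # assumed glide shape
    phase = 2.0 * np.pi * np.cumsum(freq) / sr    # integrate frequency
    return np.exp(-decay * t) * np.sin(phase)


def null_test_db(reference, candidate):
    """Null test: subtract the candidate from the reference and report the
    residual energy relative to the reference energy, in dB."""
    residual = reference - candidate
    ref_energy = np.sum(reference ** 2)
    res_energy = np.sum(residual ** 2)
    # Tiny floor avoids log10(0) when the residual is exactly zero.
    return 10.0 * np.log10((res_energy + 1e-300) / ref_energy)


sr = 44100
x = toy_mode(sr, duration=1.0, f_end=110.0, glide_hz=8.0, decay=4.0)

# DCT-II of the whole, unwindowed signal; the orthonormal scaling makes
# the inverse transform an exact round trip up to floating-point error.
X = dct(x, type=2, norm="ortho")
x_full = idct(X, type=2, norm="ortho")
full_db = null_test_db(x, x_full)

# DCT-II bin k corresponds roughly to frequency k * sr / (2 * N).
bin_freqs = np.arange(len(X)) * sr / (2.0 * len(X))
in_band = (bin_freqs > 60.0) & (bin_freqs < 200.0)  # assumed band around the mode

# Resynthesise from the in-band coefficients only and null-test the result.
x_band = idct(np.where(in_band, X, 0.0), type=2, norm="ortho")
band_db = null_test_db(x, x_band)

print(f"full round trip residual: {full_db:.1f} dB")
print(f"in-band resynthesis residual: {band_db:.1f} dB")
```

A more negative figure means the resynthesis cancels the reference more completely. The full round trip should null almost perfectly, while the band-limited version leaves whatever energy lies outside the chosen bins.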

Updated: 2021-07-12