Cochleogram-based approach for detecting perceived emotions in music, Information Processing & Management


Cochleogram-based approach for detecting perceived emotions in music
Information Processing & Management ( IF 7.4 ) Pub Date : 2020-04-28 , DOI: 10.1016/j.ipm.2020.102270
Mladen Russo , Luka Kraljević , Maja Stella , Marjan Sikora

Identifying the perceived emotional content of music is an important aspect of easy and efficient search, retrieval, and management of the media. One of the most promising use cases of music organization is the emotion-based playlist, where automatic music emotion recognition plays a significant role by providing emotion-related information that is otherwise generally unavailable. Given the importance of the auditory system in emotion recognition and processing, in this study we propose a new cochleogram-based system for detecting affective musical content. To effectively simulate the response of the human auditory periphery, the music audio signal is processed by a detailed biophysical cochlear model, yielding an output that closely matches the characteristics of human hearing. In the proposed approach, a convolutional neural network (CNN) extracts the relevant music features from cochleogram images, which we construct directly from the response of the basilar membrane. To validate the practical implications of the proposed approach with regard to its possible integration into different digital music libraries, an extensive study was conducted to evaluate its predictive performance in different aspects of music emotion recognition. The proposed approach was evaluated on the publicly available 1000 songs database, and the experimental results showed that it performed better than common musical features (such as tempo, mode, pitch, clarity, and perceptually motivated mel-frequency cepstral coefficients (MFCC)) as well as the official "MediaEval" challenge results on the same reference database. Our findings clearly show that the proposed approach can lead to better music emotion recognition performance and can be used as part of a state-of-the-art music information retrieval system.
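The abstract does not specify the biophysical cochlear model or the CNN, so neither can be reproduced here. As a rough illustration of what a cochleogram image is, the sketch below swaps in a much simpler, commonly used stand-in: a bank of fourth-order gammatone filters spaced on the ERB-rate scale, followed by log frame energies. This is not the authors' method. The function names (`erb_space`, `gammatone_ir`, `cochleogram`) and all parameter values (channel count, frequency range, frame and hop lengths) are assumptions made for this example.

```python
import numpy as np


def erb_space(f_low, f_high, n):
    """Centre frequencies (Hz) equally spaced on the ERB-rate scale."""
    lo = 21.4 * np.log10(1.0 + 0.00437 * f_low)
    hi = 21.4 * np.log10(1.0 + 0.00437 * f_high)
    rates = np.linspace(lo, hi, n)
    return (10.0 ** (rates / 21.4) - 1.0) / 0.00437


def gammatone_ir(fc, fs, duration=0.05, order=4):
    """Unit-energy impulse response of a gammatone filter centred at fc."""
    t = np.arange(int(round(duration * fs))) / fs
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)   # equivalent rectangular bandwidth
    b = 1.019 * erb                           # conventional bandwidth scaling
    ir = t ** (order - 1) * np.exp(-2.0 * np.pi * b * t) * np.cos(2.0 * np.pi * fc * t)
    return ir / np.sqrt(np.sum(ir ** 2))


def cochleogram(signal, fs, n_channels=32, f_low=50.0, f_high=7000.0,
                frame=0.025, hop=0.010):
    """Return a (n_channels, n_frames) array of log band energies in dB.

    Assumed simplification: a gammatone filterbank, not the detailed
    biophysical cochlear model described in the abstract.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(frame * fs))
    hop_len = int(round(hop * fs))
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    image = np.empty((n_channels, n_frames))
    for ch, fc in enumerate(erb_space(f_low, f_high, n_channels)):
        # Causal filtering: keep the first len(signal) output samples.
        band = np.convolve(signal, gammatone_ir(fc, fs))[:len(signal)]
        for k in range(n_frames):
            seg = band[k * hop_len:k * hop_len + frame_len]
            image[ch, k] = 10.0 * np.log10(np.mean(seg ** 2) + 1e-10)
    return image
```

The resulting two-dimensional array, with frequency channels on one axis and time frames on the other, is the kind of image that could be passed to a CNN for feature extraction. The signal must be at least one frame long for the function to return any frames.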


Updated: 2020-04-28
