The Influence of Audio on Video Memorability with an Audio Gestalt Regulated Video Memorability System
arXiv - CS - Multimedia, Pub Date: 2021-04-23, DOI: arxiv-2104.11568
Lorin Sweeney, Graham Healy, Alan F. Smeaton

Memories are the tethering threads that tie us to the world, and memorability is the measure of their tensile strength. The threads of memory are spun from fibres of many modalities, obscuring the contribution of a single fibre to a thread's overall tensile strength. Unfurling these fibres is the key to understanding the nature of their interaction, and how we can ultimately create more meaningful media content. In this paper, we examine the influence of audio on video recognition memorability, finding evidence to suggest that it can facilitate overall recognition memorability for videos rich in high-level (gestalt) audio features. We introduce a novel multimodal deep learning-based late-fusion system that uses audio gestalt to estimate the influence of a given video's audio on its overall short-term recognition memorability, and selectively leverages audio features to make a prediction accordingly. We benchmark our audio gestalt-based system on the Memento10k short-term video memorability dataset, achieving top-2 state-of-the-art results.
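
The abstract above describes a late-fusion system that uses audio features only when a video's audio gestalt suggests they will help. The listing gives no implementation details, so the following is only a minimal sketch of the general idea of gated late fusion, not the authors' method. The names `ModalityScores` and `fuse`, the threshold of 0.5, the equal weighting, and the assumption that all scores lie in [0, 1] are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModalityScores:
    """Per-video scores from separately trained predictors (all hypothetical)."""
    visual: float   # memorability predicted without audio features, assumed in [0, 1]
    audio: float    # memorability predicted from audio features, assumed in [0, 1]
    gestalt: float  # assumed audio gestalt score in [0, 1]; higher = richer high-level audio


def fuse(scores: ModalityScores, threshold: float = 0.5, audio_weight: float = 0.5) -> float:
    """Gated late fusion: use the audio prediction only when gestalt is high enough.

    The threshold and weight are illustrative assumptions, not values from the paper.
    """
    if scores.gestalt < threshold:
        # Low audio gestalt: ignore the audio branch entirely.
        return scores.visual
    # High audio gestalt: weighted average of the two late-stage predictions.
    return (1.0 - audio_weight) * scores.visual + audio_weight * scores.audio
```

With these assumed defaults, a video whose gestalt score falls below the threshold keeps its non-audio prediction unchanged, while one above it is scored by an equal-weight average of the two predictions.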

Updated: 2021-04-26