VideoModerator: A Risk-aware Framework for Multimodal Video Moderation in E-Commerce
arXiv - CS - Human-Computer Interaction Pub Date : 2021-09-08 , DOI: arxiv-2109.03479
Tan Tang, Yanhong Wu, Lingyun Yu, Yuhong Li, Yingcai Wu

Video moderation, which refers to removing deviant or explicit content from e-commerce livestreams, has become prevalent owing to their social and engaging features. However, this task is tedious and time-consuming because of the difficulty of watching and reviewing multimodal video content, including video frames and audio clips. To ensure effective video moderation, we propose VideoModerator, a risk-aware framework that seamlessly integrates human knowledge with machine insights. This framework incorporates a set of advanced machine learning models to extract risk-aware features from multimodal video content and discover potentially deviant videos. Moreover, this framework introduces an interactive visualization interface with three views, namely, a video view, a frame view, and an audio view. In the video view, we adopt a segmented timeline and highlight high-risk periods that may contain deviant information. In the frame view, we present a novel visual summarization method that combines risk-aware features and video context to enable quick video navigation. In the audio view, we employ a storyline-based design to provide a multi-faceted overview that can be used to explore audio content. Furthermore, we demonstrate the use of VideoModerator through a case scenario and conduct experiments and a controlled user study to validate its effectiveness.
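
The segmented timeline with highlighted high-risk periods can be illustrated with a minimal sketch. The abstract does not say how risk scores are computed or how periods are chosen, so everything here is an assumption for illustration: one risk score in [0, 1] per second of video, fixed-length segments, a segment counted as high-risk when its maximum score reaches a threshold, and the hypothetical names `high_risk_periods`, `segment_len`, and `threshold`.

```python
from typing import List, Tuple


def high_risk_periods(
    scores: List[float],
    segment_len: int = 5,
    threshold: float = 0.6,
) -> List[Tuple[int, int]]:
    """Return (start, end) second ranges of high-risk periods, end exclusive.

    Assumption (not from the paper): `scores` holds one risk score per second,
    the timeline is cut into fixed segments of `segment_len` seconds, and a
    segment is high-risk when its maximum score is at least `threshold`.
    Adjacent high-risk segments are merged into one period.
    """
    periods: List[Tuple[int, int]] = []
    current_start = None  # start of the period currently being extended

    for start in range(0, len(scores), segment_len):
        segment = scores[start:start + segment_len]
        risky = max(segment) >= threshold
        if risky and current_start is None:
            current_start = start  # open a new high-risk period
        elif not risky and current_start is not None:
            periods.append((current_start, start))  # close the open period
            current_start = None

    # Close a period that runs to the end of the video.
    if current_start is not None:
        periods.append((current_start, len(scores)))
    return periods


if __name__ == "__main__":
    demo = [0.1] * 5 + [0.2, 0.9, 0.1, 0.1, 0.1] + [0.7] * 5 + [0.0] * 5 + [0.8, 0.8]
    print(high_risk_periods(demo))
```

A moderator-facing timeline could then shade each returned range. Using the segment maximum rather than the mean is a deliberate choice in this sketch so that a brief spike is not averaged away.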

Updated: 2021-09-09