Deep Multimodality Learning for UAV Video Aesthetic Quality Assessment
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2020-11-04, DOI: arxiv-2011.02356
Qi Kuang, Xin Jin, Qinping Zhao, Bin Zhou

Despite the growing number of unmanned aerial vehicles (UAVs) and aerial videos, few studies address the aesthetics of aerial video, even though such work could provide valuable guidance for improving the aesthetic quality of aerial photography. In this article, we present a deep multimodality learning method for UAV video aesthetic quality assessment. Specifically, we design a multistream framework that exploits aesthetic attributes from multiple modalities: spatial appearance, drone camera motion, and scene structure. We propose a novel motion stream network designed for this framework. We construct a dataset of 6,000 UAV video shots captured by drone cameras. Our model judges whether a UAV video was shot by a professional photographer or an amateur and, at the same time, classifies the scene type. Experimental results show that our method outperforms both video classification methods and traditional SVM-based methods for video aesthetics. In addition, we present three applications of the proposed method: UAV video grading, professional segment detection, and aesthetic-based UAV path planning.
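
As a rough illustration of the multistream idea in the abstract, the sketch below fuses one feature vector per modality and feeds the result to two classification heads (professional vs. amateur, and scene type). Everything specific here is an assumption made for illustration and is not taken from the paper: fusion by concatenation, the feature size of 4, the three scene classes, the random untrained weights, and the helper names `linear`, `softmax`, `init_layer`, and `assess`.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Random, untrained weights (illustration only) and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def assess(streams, aesthetic_head, scene_head):
    """Fuse per-stream features and run both heads.

    Assumption: late fusion by concatenation in a fixed stream order.
    """
    fused = streams["appearance"] + streams["motion"] + streams["structure"]
    p_aesthetic = softmax(linear(*aesthetic_head, fused))  # [amateur, professional]
    p_scene = softmax(linear(*scene_head, fused))          # one entry per scene type
    return p_aesthetic, p_scene


rng = random.Random(0)
FEATURE_DIM = 4      # hypothetical per-stream feature size
NUM_SCENE_TYPES = 3  # hypothetical number of scene classes

# Stand-ins for the outputs of the three stream networks.
streams = {
    name: [rng.random() for _ in range(FEATURE_DIM)]
    for name in ("appearance", "motion", "structure")
}
aesthetic_head = init_layer(rng, 2, 3 * FEATURE_DIM)
scene_head = init_layer(rng, NUM_SCENE_TYPES, 3 * FEATURE_DIM)

p_aesthetic, p_scene = assess(streams, aesthetic_head, scene_head)
print("aesthetic probabilities:", p_aesthetic)
print("scene probabilities:", p_scene)
```

In a real system each entry of `streams` would come from a trained network for that modality, and the heads would be trained jointly; the sketch only shows how several streams can share one fused representation that serves both predictions.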

Updated: 2020-11-05