Ensemble Network for Ranking Images Based on Visual Appeal
arXiv - CS - Multimedia, Pub Date: 2020-06-06, DOI: arxiv-2006.03898
Sachin Singh, Victor Sanchez and Tanaya Guha

We propose a computational framework for ranking images (group photos in particular) taken at the same event within a short time span. The ranking is expected to correspond to human perception of the overall appeal of the images. We hypothesize, and provide evidence through subjective analysis, that the factors that make an image appealing to humans are its emotional content, aesthetics and image quality. We propose a network that is an ensemble of three information channels, each predicting a score for one of these three visual appeal factors. For group emotion estimation, we propose a convolutional neural network (CNN) based architecture that predicts group emotion from images. This new architecture forces the network to emphasize the important regions in the images, and achieves results comparable to the state-of-the-art. Next, we develop a network for the image ranking task that combines the group emotion, aesthetics and image quality scores. Owing to the unavailability of suitable databases, we created a new database of manually annotated group photos taken during various social events. We present experimental results on this database and on other benchmark databases where available. Overall, our experiments show that the proposed framework can reliably predict the overall appeal of images, with results closely corresponding to human ranking.
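
To make the three-channel idea concrete, here is a minimal sketch of ranking images from per-channel scores. It is an illustration only, not the authors' method: the abstract says the scores are combined by a network, whereas this sketch swaps in a fixed weighted sum. The weights, the function names `appeal_score` and `rank_images`, and the assumption that each channel score lies in [0, 1] are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical fixed weights for the three channels (assumption, not from the paper;
# the paper combines the channel scores with a network rather than a weighted sum).
WEIGHTS: Dict[str, float] = {"emotion": 0.4, "aesthetics": 0.3, "quality": 0.3}


def appeal_score(scores: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Combine per-channel scores (assumed to lie in [0, 1]) into one appeal score."""
    return sum(weights[name] * scores[name] for name in weights)


def rank_images(images: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Return (image name, appeal score) pairs sorted from most to least appealing."""
    scored = [(name, appeal_score(channel_scores)) for name, channel_scores in images.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up channel scores for two photos from the same event.
    photos = {
        "photo_a": {"emotion": 0.9, "aesthetics": 0.5, "quality": 0.5},
        "photo_b": {"emotion": 0.2, "aesthetics": 0.6, "quality": 0.6},
    }
    for name, score in rank_images(photos):
        print(name, round(score, 2))
```

In practice the channel scores would come from the three predictors described above, and the combination step would be learned from human rankings rather than set by hand.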

Updated: 2020-06-09