Image Aesthetics Prediction Using Multiple Patches Preserving the Original Aspect Ratio of Contents
arXiv - CS - Multimedia. Pub Date: 2020-07-05, DOI: arxiv-2007.02268
Lijie Wang, Xueting Wang and Toshihiko Yamasaki

The spread of social networking services has created an increasing demand for selecting, editing, and generating impressive images. This trend increases the importance of evaluating image aesthetics as a complementary function of automatic image processing. We propose a multi-patch method, named MPA-Net (Multi-Patch Aggregation Network), that predicts image aesthetics scores while maintaining the original aspect ratios of the contents in the images. Through an experiment on the large-scale AVA dataset, which contains 250,000 images, we show that equal-interval multi-patch selection is significantly more effective for aesthetics score prediction than single-patch prediction or random patch selection. On this dataset, MPA-Net outperforms the neural image assessment algorithm, which was used as the baseline method. In particular, MPA-Net yields a 0.073 (11.5%) higher linear correlation coefficient (LCC) of aesthetics scores and a 0.088 (14.4%) higher Spearman's rank correlation coefficient (SRCC). MPA-Net also reduces the mean square error (MSE) by 0.0115 (4.18%) and achieves LCC and SRCC results comparable to those of state-of-the-art continuous aesthetics score prediction methods. Most notably, MPA-Net yields a significantly lower MSE for images with aspect ratios far from 1.0, indicating that MPA-Net is useful for a wide range of image aspect ratios. MPA-Net uses only images and requires no external information during either the training or the prediction stage. Therefore, MPA-Net has great potential for applications beyond aesthetics score prediction, such as predicting other kinds of human subjective judgments.
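
The abstract does not spell out how the equal-interval patches are chosen, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the image is first resized so that its shorter side equals the patch size (keeping the aspect ratio), and that square patches are then taken at evenly spaced offsets along the longer side. The function names `resized_size` and `equal_interval_boxes`, the patch size, and the patch count are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def resized_size(width: int, height: int, short_side: int) -> Tuple[int, int]:
    """Size after scaling so the shorter side equals short_side.

    The aspect ratio is preserved up to rounding. (Assumed preprocessing,
    not taken from the paper.)
    """
    if width <= 0 or height <= 0 or short_side <= 0:
        raise ValueError("all sizes must be positive")
    if width <= height:
        return short_side, round(height * short_side / width)
    return round(width * short_side / height), short_side


def equal_interval_boxes(width: int, height: int, patch: int, n: int) -> List[Box]:
    """Return n square patch boxes spaced evenly along the longer side.

    Expects an image whose shorter side already equals `patch`, so the
    patches only slide along the longer side. With n == 1 the single
    patch is centered.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if min(width, height) != patch:
        raise ValueError("shorter side must equal the patch size")

    room = max(width, height) - patch  # free space along the longer side
    if n == 1:
        offsets = [room // 2]
    else:
        # Integer arithmetic keeps the first offset at 0 and the last at room.
        offsets = [(room * i) // (n - 1) for i in range(n)]

    if width >= height:  # landscape or square: slide horizontally
        return [(o, 0, o + patch, patch) for o in offsets]
    return [(0, o, patch, o + patch) for o in offsets]  # portrait: slide vertically


if __name__ == "__main__":
    w, h = resized_size(1920, 1080, 224)
    for box in equal_interval_boxes(w, h, patch=224, n=5):
        print(box)
```

Under these assumptions, every patch keeps the contents undistorted, and together the patches cover the full extent of the longer side, which is the property the abstract credits for the lower error on images with aspect ratios far from 1.0. Random selection would correspond to drawing the offsets at random instead of spacing them evenly.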

Updated: 2020-07-07