Subjective evaluation of colourized images with different colorization models
Color Research and Application (IF 1.4), Pub Date: 2020-11-20, DOI: 10.1002/col.22593
Xiao Teng 1, Zhijiang Li 1, Qiang Liu 1,2, Michael R. Pointer 3, Zheng Huang 1, Hongguang Sun 1

Two psychophysical experiments were conducted to evaluate the performance of grayscale image colorization models, and to verify the objective image quality metrics adopted in grayscale image colorization. Twenty representative grayscale images were colourized by four colorization models, and three typical metrics, root mean square error (RMSE), peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), were used to characterize the objective quality of the colourized images. Forty observers were asked to evaluate those images based on their subjective preference in a pair-comparison experiment, and to evaluate the perceived similarity between the generated and reference colour images using a seven-point rating scale. The experimental results indicate that different colorization models and objective metrics exhibit different performance in different scenarios. Each colorization method has its own advantages and disadvantages, and none of the tested models performed well for all images. For preference, the model proposed by Iizuka et al. based on ImageNet performed better, while for perceived similarity the models proposed by Zhang et al. and Iizuka et al., also based on ImageNet, outperformed the models of Larsson et al. and Iizuka et al. which were based on the Places dataset. Because many objects occur in instances of distinct colour, a colorization algorithm cannot correctly reconstruct the ground truth image for most grayscale images, although it was found that the perceived similarity and preference ratings of observers were correlated. In addition, it was found that the tested objective metrics correlated poorly with the subjective judgments of the human observers and that their performance varied significantly with image content. These findings demonstrate the limitations of current image colorization studies, and it is suggested that due consideration must be given to human visual perception when evaluating the performance of colorization models.
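
For readers unfamiliar with the objective metrics named in the abstract, the following is a minimal sketch of RMSE and PSNR using their standard textbook definitions. It is not the authors' implementation: the function names, the flat-list pixel representation, and the 8-bit peak value of 255 are assumptions made for illustration. SSIM is omitted because it needs windowed local statistics.

```python
import math


def rmse(reference, generated):
    """Root mean square error between two equally sized pixel sequences."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(r - g) ** 2 for r, g in zip(reference, generated)]
    return math.sqrt(sum(squared) / len(squared))


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical.

    max_value is the assumed peak pixel value (255 for 8-bit channels).
    """
    error = rmse(reference, generated)
    if error == 0:
        return math.inf
    return 20 * math.log10(max_value / error)


# Hypothetical pixel values, flattened to one list per image.
reference = [10, 20, 30, 40]
generated = [12, 18, 33, 40]
print(rmse(reference, generated))
print(psnr(reference, generated))
```

Both metrics compare pixel values only, so a plausible colorization that differs from the reference (for example, a red car rendered blue) is scored as a large error. This is in line with the abstract's finding that such metrics correlated poorly with observer judgments.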

Updated: 2020-11-20