Full length article
Siamese CNN-based rank learning for quality assessment of inpainted images

https://doi.org/10.1016/j.jvcir.2021.103176

Highlights

  • Siamese CNN-based NR-IIQA, automatically ranking two inpainted images.

  • A patch-wise coherence assessment module to evaluate local color and texture consistency within and around inpainted contents.

  • A dataset containing thousands of inpainted image pairs with ground-truth ranking labels.

Abstract

Existing NR-IIQA (no-reference inpainted image quality assessment) algorithms assess the quality of an inpainted image via artificially designed unnaturalness expressions, which often fail to capture inpainting artifacts. This paper presents a new deep rank learning-based method for NR-IIQA. The model adopts a siamese deep architecture, which takes a pair of inpainted images as input and outputs their rank order. Each branch utilizes a CNN structure to capture global structure coherence and a patch-wise coherence assessment module (PCAM) to depict local color and texture consistency in an inpainted image. To train the deep model, we construct a new dataset, which contains thousands of pairs of inpainted images with ground-truth quality ranking labels. Rich ablation studies are conducted to verify the key modules of the proposed architecture. Comparative experimental results demonstrate that our method outperforms existing NR-IIQA metrics in evaluating both inpainted images and inpainting algorithms.

Introduction

Image inpainting is the task of filling holes in images caused by content damage or unwanted object removal [1], [2], [3]. Image inpainting has received considerable attention, and numerous effective inpainting algorithms have been proposed over the past decades. In this paper, we focus on inpainted image quality assessment (IIQA), which is essential for comparing different inpainting methods.

Many IIQA methods already exist. Some of them use fidelity metrics computed by referring to ground-truth images. These metrics, such as the structural similarity index measure (SSIM) [4] and the peak signal-to-noise ratio (PSNR), calculate the similarity between an inpainted image and its ground-truth reference and use it as the quality score of the inpainted image. Such metrics are precise but lack practicality, because original ground-truth images generally do not exist outside experimental environments. Thereafter, several visual saliency-based IIQA metrics [5] were proposed to evaluate inpainted images. Although they require no ground-truth references, these methods tend to confuse visually salient regions with unnatural inpainting artifacts.
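
For illustration, the full-reference metrics above can be computed with scikit-image as in the following sketch; the helper name fidelity_scores and the assumption of 8-bit RGB inputs are ours, not part of the paper.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def fidelity_scores(inpainted: np.ndarray, reference: np.ndarray):
    """Full-reference quality of an inpainted image against its ground
    truth; both inputs are assumed to be uint8 RGB arrays of equal shape."""
    psnr = peak_signal_noise_ratio(reference, inpainted, data_range=255)
    ssim = structural_similarity(reference, inpainted,
                                 channel_axis=-1, data_range=255)
    return psnr, ssim
```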

Recently, many no-reference (NR) machine learning-based algorithms have been developed. For example, Viacheslav et al. [6] predicted an absolute quality score for an inpainted image by extracting artificially designed features that depict inpainted-and-surrounding inconsistency and training a support vector regression (SVR) model for the feature-to-score mapping. Training these algorithms requires sufficient data with accurate ground-truth quality scores. However, scoring images manually costs considerable effort. In addition, objective quality scores are hard to obtain: the scores given to the same image by different persons, or even by the same person at different moments, can differ substantially. For these reasons, learning-to-score NR-IIQA algorithms are hard to use in practice.
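
A minimal sketch of such a learning-to-score pipeline, assuming hand-crafted features have already been extracted (the random arrays below merely stand in for real features and subjective scores):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 16))   # stand-in hand-crafted features
y_train = rng.uniform(0, 5, size=200)  # stand-in subjective quality scores

# Feature-to-score mapping with support vector regression, in the spirit
# of [6]; the kernel and hyperparameters are illustrative assumptions.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.1))
model.fit(X_train, y_train)

predicted_scores = model.predict(rng.normal(size=(10, 16)))
```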

Considering that the main goal of IIQA is to compare inpainted results, NR-IIQA can be cast as a ranking problem, i.e., predicting the preference order of inpainted images. Compared to scoring inpainted images, ground-truth data for training rank models can be obtained easily, even without manual intervention [7], [8]. Unfortunately, existing learning-to-rank IIQA metrics rely on artificially designed features and traditional support vector machine (SVM) models [7], [8], which are limited in their ability to express and evaluate inpainted-and-surrounding inconsistency.
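
As a concrete illustration of obtaining rank labels without manual intervention, the sketch below derives pairwise labels for inpainted versions of the same image from a full-reference metric, which is possible whenever the original exists at data-generation time (as in synthetic settings [7], [8]). The helper name make_ranked_pairs and the use of PSNR as the ordering criterion are our assumptions.

```python
import itertools
from skimage.metrics import peak_signal_noise_ratio

def make_ranked_pairs(inpainted_versions, reference):
    """Derive pairwise rank labels for several inpainted versions of one
    image from a full-reference metric (possible because the reference is
    known when the training data is synthesized). Label +1 means the first
    image of the pair is the better restoration, -1 the second."""
    scored = [(img, peak_signal_noise_ratio(reference, img, data_range=255))
              for img in inpainted_versions]
    return [(a, b, 1 if sa > sb else -1)
            for (a, sa), (b, sb) in itertools.combinations(scored, 2)]
```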

In this paper, we also regard IIQA as a pairwise rank learning task and propose a new deep NR-IIQA algorithm for assessing inpainted images. The proposed model adopts a siamese CNN architecture trained to rank two inpainted images. Each CNN branch is designed and trained to encode the inpainted-and-surrounding inconsistency at global and multi-scale local perceptual levels. Training the proposed deep model requires sufficient data, but there are no large public IIQA datasets. We therefore construct a pairwise ranking dataset for the task of deep learning-to-rank IIQA. Within this dataset, we provide the quality ranking order of each pair of inpainted images as ground truth, which is much easier to obtain than manually annotating an absolute score for each training image. Once our model is trained, it can judge which image of an input pair is better restored, and it can therefore be used to rank the performance of inpainting algorithms pairwise.
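
The weight sharing at the heart of such a siamese ranker can be sketched in PyTorch as follows; the tiny backbone below is a deliberate stand-in, and the paper's actual branch (including its PCAM module) is not reproduced here.

```python
import torch
import torch.nn as nn

class Branch(nn.Module):
    """Stand-in for one siamese branch: a small CNN mapping an inpainted
    image to a scalar quality score (the paper's branch additionally
    contains a PCAM module, omitted in this sketch)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.score = nn.Linear(64, 1)

    def forward(self, x):
        return self.score(self.features(x).flatten(1))

class SiameseRanker(nn.Module):
    """Two inputs, one shared branch: both images are scored by the same
    weights, so the sign of the score difference gives the rank order."""
    def __init__(self):
        super().__init__()
        self.branch = Branch()

    def forward(self, img_a, img_b):
        return self.branch(img_a), self.branch(img_b)
```

Because both images pass through the same weights, the two scores are directly comparable; training only has to push the better-restored image's score above the other's.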

Our main contributions are as follows:

  • 1.

    We propose a siamese CNN-based algorithm for NR-IIQA. To the best of our knowledge, this is the first deep learning-based IIQA model. The proposed model can automatically rank two inpainted images by capturing the multi-grain perceptual consistency within and around inpainted contents.

  • 2.

To address the lack of training data for deep learning-to-rank IIQA, we synthesize a large dataset containing thousands of inpainted image pairs. We also construct a relatively small real dataset with images repaired by various inpainting algorithms. Our model is trained with the synthetic data plus around half of the real data and tested on the remaining part of the real dataset.

  • 3.

We conduct rich ablation studies to verify the key modules of the proposed model. We also compare our model with both reference-based metrics and no-reference methods. Experimental results demonstrate that our model achieves performance closest to the reference-based metrics and outperforms existing no-reference metrics by large margins. In addition, we apply the proposed model to evaluate different inpainting algorithms, where it obtains performance closest to that of the reference-based PSNR metric.

The rest of the paper is organized as follows. Section 2 discusses related work on IIQA metrics and algorithms. The constructed inpainted image dataset with pairwise quality ranking labels is introduced in Section 3. The details of the proposed deep learning-to-rank model for IIQA are described in Section 4. The performance and applications of the proposed model are presented in Section 5. Finally, Section 6 concludes the paper.

Related work

In recent years, a series of quality assessment metrics for inpainted images has been proposed. These metrics can be divided into three categories: similarity-based metrics, visual saliency-based metrics, and machine learning-based methods. The first category comprises full-reference (FR) approaches, since they need corresponding ground-truth images as references to evaluate inpainted ones. The latter two generally require no reference images.

Dataset

A high-performance learning-based prediction model usually needs a large amount of training data. Tiefenbacher et al. [24] published, for the first time, a dataset named TUM-IID for the IIQA task. TUM-IID contains 17 original images. Each image has four regions, each of which is completed by four different off-the-shelf inpainting algorithms [25], [26], [27], [28], yielding 272 inpainted images with manually annotated quality scores. Although it has been used for IIQA in some studies [29],

Method

This section presents the proposed deep model. We first give an overview of the model, then describe the core parts of the architecture, and finally introduce the loss function used in training.
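
This snippet does not spell out the paper's loss; for pairwise rank learning a standard choice, which we assume here purely for illustration, is the margin ranking loss:

```python
import torch
import torch.nn as nn

# Margin ranking loss: max(0, -target * (score_a - score_b) + margin).
# It penalizes pairs whose predicted score difference disagrees with the
# ground-truth rank label by more than the margin.
loss_fn = nn.MarginRankingLoss(margin=0.5)  # margin value is an assumption

score_a = torch.randn(8, 1, requires_grad=True)  # branch scores, images A
score_b = torch.randn(8, 1, requires_grad=True)  # branch scores, images B
target = torch.ones(8, 1)  # +1 if A is better restored than B, else -1

loss = loss_fn(score_a, score_b, target)
loss.backward()
```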

Experiment

In this section, we evaluate the constructed datasets and the proposed model through experiments. In Section 5.1, we experimentally determine the composition of the training dataset. Section 5.2 introduces the key experimental settings. In Section 5.3, we compare the proposed algorithm with existing IIQA metrics. In Section 5.4, we perform ablation studies. The application of the proposed model to inpainting algorithm evaluation is presented in Section 5.5.

Conclusion

This paper presented a no-reference deep IIQA model, the first deep learning-based IIQA metric. The model adopts a siamese convolutional neural network to capture and rank the naturalness of inpainted contents at the global structure level. Meanwhile, it encodes local color and texture consistency at multiple scales via the specifically designed PCAM module. The global–local, multi-feature-type evaluations are then combined to compute an absolute score reflecting the quality of an

CRediT authorship contribution statement

Xiangdong Meng: Methodology, Writing - original draft. Wei Ma: Conceptualization, Methodology, Writing - reviewing and editing. Chunhu Li: Validation, Writing - reviewing and editing. Qing Mi: Validation, Writing - reviewing and editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (No. 61771026) and by the Key Project of Beijing Municipal Education Commission, China (No. KZ201910005008).

References (44)

  • X. Hong, P. Xiong, R. Ji, H. Fan, Deep fusion network for image completion, in: Proceedings of the 27th ACM...
  • A. Criminisi et al., Region filling and object removal by exemplar-based image inpainting, IEEE Trans. Image Process. (2004)
  • W. Ma et al., Learning across views for stereo image inpainting, IET Comput. Vis. (2020)
  • Z. Wang et al., Image quality assessment: from error visibility to structural similarity, IEEE Trans. Image Process. (2004)
  • M.V. Venkatesh, S.C. Sen-ching, Eye tracking based perceptual image inpainting quality analysis, in: 2010 IEEE...
  • V. Viacheslav et al., Low-level features for inpainting quality assessment
  • M. Isogawa et al., Image quality assessment for inpainted images via learning to rank, Multimedia Tools Appl. (2019)
  • M. Isogawa et al., Which is the better inpainted image? Training data generation without any manual operations, Int. J. Comput. Vis. (2019)
  • K. Nazeri et al., EdgeConnect: Generative image inpainting with adversarial edge learning (2019)
  • R.A. Yeh, C. Chen, T. Yian Lim, A.G. Schwing, M. Hasegawa-Johnson, M.N. Do, Semantic image inpainting with deep...
  • X. Snelgrove, High-resolution multi-scale neural texture synthesis, in: SIGGRAPH Asia 2017 Technical Briefs, 2017, pp....
  • Y. Blau, T. Michaeli, The perception-distortion tradeoff, in: Proceedings of the IEEE Conference on Computer Vision and...
  • W. Zhu, S. Liang, Y. Wei, J. Sun, Saliency optimization from robust background detection, in: Proceedings of the IEEE...
  • C. Yang, L. Zhang, H. Lu, X. Ruan, M.H. Yang, Saliency detection via graph-based manifold ranking, in: Proceedings of...
  • L. Itti et al., A model of saliency-based visual attention for rapid scene analysis, IEEE Trans. Pattern Anal. Mach. Intell. (1998)
  • P.A. Ardis et al., Visual salience metrics for image inpainting
  • A.I. Oncu, F. Deger, J.Y. Hardeberg, Evaluation of digital inpainting quality in the context of artwork restoration,...
  • T.T. Dang, A. Beghdadi, M.C. Larabi, Visual coherence metric for evaluation of color image restoration, in: 2013 Colour...
  • D.P. Kingma et al., Adam: A method for stochastic optimization (2014)
  • S. Bosse et al., Deep neural networks for no-reference and full-reference image quality assessment, IEEE Trans. Image Process. (2017)
  • X. Liu, J. Van De Weijer, A.D. Bagdanov, RankIQA: Learning from rankings for no-reference image quality assessment, in:...
  • V. Voronin, V. Marchuk, E. Semenishchev, S. Maslennikov, I. Svirin, Inpainted image quality assessment based on machine...
This paper has been recommended for acceptance by Zicheng Liu.