Visual privacy-preserving level evaluation for multilayer compressed sensing model using contrast and salient structural features

https://doi.org/10.1016/j.image.2020.115996

Highlights

  • We propose an MCS model to balance visual privacy preservation and recognition.

  • An improved Gaussian random measurement matrix is adopted in the MCS model.

  • Our MCS visual evaluation is based on contrast and salient structural features.

  • The proposed method achieves desirable performance on three constructed datasets.

Abstract

Recognition and classification tasks on images or videos are ubiquitous, but they can lead to privacy issues. People increasingly hope that camera systems can record and recognize important events and objects, such as real-time recording of traffic conditions and accident scenes, elderly fall detection, and in-home monitoring. However, people also want to ensure that these activities do not violate the privacy of users or others. Sparse representation classification and recognition algorithms based on compressed sensing (CS) are robust at recognizing human faces from frontal views with varying expressions and illuminations, as well as occlusions and disguises, which suggests a potential way to perform recognition tasks while preserving visual privacy. In this paper, an improved Gaussian random measurement matrix is adopted in the proposed multilayer CS (MCS) model to realize multilayer image CS and achieve a balance between visual privacy preservation and recognition tasks. Evaluating the visual privacy-preserving level of MCS images provides important guidance for subsequent image processing and recognition. Therefore, we propose an image visual privacy-preserving level evaluation method for the MCS model (MCS-VPLE) based on contrast and salient structural features. The basic idea is to use a contrast measurement model based on the statistical mean of the asymmetric alpha-trimmed filter and a salient generalized center-symmetric local binary pattern operator to extract contrast and salient structural features, respectively. The features are fed into a support vector regression to obtain an image quality score, and the fuzzy c-means algorithm is then used for clustering to obtain the final evaluated visual privacy-preserving score. Experiments on three constructed databases show that the proposed method achieves better prediction effectiveness and performance than conventional methods.

Introduction

With the rapid development of multimedia and the Internet, a massive number of images and videos are generated and distributed every day. Therefore, it is important to ensure data security and privacy. Since signal and image processing applications often involve user-related data, privacy-preserving concerns have been raised widely [1], [2]. In many service-based applications, digital multimedia such as images and videos contain privacy-sensitive information, which can easily be leaked and abused if collection, processing, or storage is performed improperly [3]. Some existing image privacy-preserving methods are applied in the form of image encryption in cloud computing or image transmission [4], [5].

Visual data expose significant information about individuals appearing in images and videos [6]. Based on human vision, visual invisibility can maximize the concealment of visual information in images and videos. In recent years, image recognition under visual privacy protection has received increasing attention. Chen et al. [7] presented a novel architecture that combines a VAE (Variational Auto-Encoder) and a GAN (Generative Adversarial Network) to create an identity-invariant representation of a face-containing image for privacy-preserving facial expression recognition and face image synthesis. Wu et al. [8] proposed an adversarial training framework that explicitly learns a degradation transform for the original video inputs, leading to an adaptive and end-to-end manageable pipeline for privacy-preserving visual recognition. Zhang et al. [5] proposed a secure and efficient outsourcing protocol for face recognition through principal component analysis. However, these methods are based on original images or videos, i.e., they process and recognize visual privacy-protected or anonymized images or videos derived from the originals.

In some scenarios, we want cameras to directly capture as little privacy-sensitive information as possible while preserving identification information to the greatest extent. Therefore, directly collecting low-resolution or low-quality images and videos and using them for recognition and classification tasks is an effective approach. Ryoo et al. [9] introduced an inverse super-resolution method to improve classification performance on extreme low-resolution videos, addressing human activity recognition using only extreme low-resolution anonymized videos. Chou et al. [10] used low-resolution depth images to remove privacy-relevant information while retaining activity-recognition utility. These methods assume that the processed low-quality or low-resolution images have already achieved visual privacy protection; the premise of recognition is that the image is visually privacy-protected. Therefore, we need to determine whether an image is visually privacy-protected by assessing its visual privacy-preserving level. Although the above methods can be believed to perform recognition tasks under visual privacy protection to some extent, it is still desirable to assess the level of image visual privacy preservation to determine whether visual privacy protection has truly been achieved.

As is well known, image quality significantly affects the performance of recognition algorithms. For example, using multiple images can enhance recognition performance; however, this introduces an additional computational burden. Therefore, predicting whether an image is good for recognition is of great importance in real application scenarios, where a sequence of images is always presented and the image frame with the best quality should be selected for the subsequent matching and recognition tasks [11]. It is thus meaningful to develop image quality assessment (IQA) algorithms. Objective IQA metrics can be divided into three types based on the amount of information obtained from a reference image: full-reference (FR), reduced-reference (RR), and blind/no-reference (NR). In practice, the original reference images may be unavailable; hence FR and RR methods are infeasible and NR methods are desired. To achieve the best prediction performance, IQA metrics attempt to model various processing mechanisms of the human visual system (HVS), such as contrast sensitivity [12] and structural information [13].

Similar to image quality assessment, when visual privacy-protected images are used for processing and recognition, evaluating the image visual privacy-preserving level becomes important, as it can provide guidance for recognition tasks when the image content is visually invisible or indistinguishable. Padilla-López et al. [14] presented a privacy scheme that uses visualization levels to display the real image in different ways for privacy preservation, evaluated by whether participants could extract the requested information from the images. To evaluate a globally applied privacy filter based on cartooning, Erdélyi et al. [15] employed the structural similarity (SSIM) [13], peak signal-to-noise ratio (PSNR), the standard Viola–Jones face detector, and three different face recognizers to assess its performance. Visual privacy-preserving level evaluation is conceptually similar to visual security evaluation in the field of image encryption and steganography. The latter is usually based on image quality assessment methods, and experimental results are generally compared with PSNR, SSIM, and other image quality assessment algorithms or subjective analysis [16]. The local feature based visual security metric (LFBVS) introduced in [17] utilizes localized edge and luminance features that are combined and weighted based on their error magnitudes. However, this approach is an FR metric, meaning it requires information from both the original and the test (encrypted) image to assess visual similarity.

Hofbauer and Uhl [18] found that the performance of the tested image metrics in the encrypted domain, when evaluated against the application domain, strongly indicates that none of them, including PSNR, SSIM, LFBVS, and other security metrics, can evaluate content confidentiality. Thus, only subjective analysis of visual effects, image quality, or the recognition rate of standard detection and recognition methods is considered, which is insufficient to reflect the image visual privacy-preserving level and perceptual security. Support vector regression (SVR) [19] is commonly adopted in IQA to map feature vectors to subjective quality scores. Therefore, based on fuzzy set and membership theory in cybernetics and the two-stage privacy-preserving collaborative fuzzy clustering scheme proposed by Lyu et al. [20], the image quality scores can be treated as a fuzzy interval, and an appropriate membership function is selected to make the interval boundaries fuzzy, thereby mapping the image quality scores to visual privacy-preserving scores.
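As a rough illustration of this fuzzy-boundary idea (not the construction used in the paper), the sketch below partitions quality scores into three privacy-level intervals whose boundaries are softened with Gaussian membership functions; the interval centers and the width sigma are arbitrary placeholders.

```python
# Illustrative only: a quality score near an interval boundary belongs
# partially to both neighbouring privacy levels instead of being hard-assigned.
import numpy as np

def fuzzy_memberships(score, centers=(20.0, 50.0, 80.0), sigma=10.0):
    """Soft membership of a quality score in three assumed privacy-level intervals."""
    centers = np.asarray(centers)
    u = np.exp(-0.5 * ((score - centers) / sigma) ** 2)   # Gaussian memberships
    return u / u.sum()                                    # normalize to sum to 1

print(fuzzy_memberships(55.0))   # mostly the middle level, partly the upper one
```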

The above concerns motivate the proposed efficient visual privacy-preserving level evaluation for images in the multilayer compressed sensing (MCS) model, i.e., MCS-VPLE, which provides guidance for recognition tasks under privacy preservation. Specifically, we construct three MCS visual privacy-preserving level evaluation databases to evaluate the proposed method. The experimental results demonstrate that our metric achieves remarkable prediction monotonicity and accuracy on the constructed databases. The main contributions of this work are summarized below.

Based on the theory of compressed sensing (CS), we propose an image MCS model that utilizes an improved Gaussian random measurement matrix to sample and encode images for recognition under privacy preservation. In the MCS model, an input image and a measurement matrix of the same size are divided into 2 × 2 blocks, and an inner product is computed on each pair of corresponding blocks to obtain the next layer of the CS image. To avoid feature loss and to ensure the consistency of the sampled data, the improved Gaussian random measurement matrix is obtained by translating all elements of a Gaussian random measurement matrix and normalizing them.
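A minimal sketch of one possible reading of a single MCS layer follows. The 2 × 2 block-wise inner product mirrors the description above, while the exact translation and normalization applied to the Gaussian matrix are not specified here, so shifting by the minimum and rescaling to [0, 1] is an assumption.

```python
# Sketch of one MCS layer: the image and a same-size measurement matrix are
# split into 2x2 blocks, and each block-wise inner product yields one pixel
# of the next-layer CS image (so each layer halves both dimensions).
import numpy as np

def improved_gaussian_matrix(shape, rng):
    """Gaussian random matrix made non-negative and normalized (assumed steps)."""
    phi = rng.standard_normal(shape)
    phi = phi - phi.min()            # translation: all elements >= 0 (assumption)
    return phi / phi.max()           # normalization to [0, 1] (assumption)

def mcs_layer(image, rng):
    """One MCS layer: 2x2 block-wise inner products halve each dimension."""
    h, w = image.shape
    phi = improved_gaussian_matrix((h, w), rng)
    blocks_img = image.reshape(h // 2, 2, w // 2, 2)
    blocks_phi = phi.reshape(h // 2, 2, w // 2, 2)
    return (blocks_img * blocks_phi).sum(axis=(1, 3))   # inner product per block

rng = np.random.default_rng(0)
img = rng.random((128, 128))
layer1 = mcs_layer(img, rng)         # 64 x 64
layer2 = mcs_layer(layer1, rng)      # 32 x 32 -- deeper layers hide more detail
print(layer1.shape, layer2.shape)
```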

When extracting the contrast feature, we propose a contrast measurement model, CAAME (Color/Cube Asymmetric Alpha-trimmed Mean Enhancement Contrast Measure), in which the statistical mean of the asymmetric alpha-trimmed filter replaces the conventional statistical mean used in CRME (Color/Cube Root Mean Enhancement). When extracting the salient structural feature, we propose a salient generalized center-symmetric local binary pattern (SGCS-LBP) operator. The histogram of the texture map is obtained from the GCS-LBP (Generalized Center-Symmetric LBP) operator, and each LBP label is weighted according to its saliency, which is obtained from the visual attention computational model GBVS (Graph-Based Visual Saliency). We compute the statistical histogram variation trend value (HVTV) as the salient structural feature based on the weighted histogram.
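The simplified sketch below illustrates the two feature ideas in isolation: an asymmetric alpha-trimmed mean (different trim fractions at the two tails; the fractions are assumed) and a plain 3 × 3 center-symmetric LBP code. The color/cube CRME formulation, the generalized CS-LBP variant, the GBVS saliency weighting, and the HVTV statistic are omitted here.

```python
# Illustrative feature building blocks, not the paper's exact implementation.
import numpy as np

def asymmetric_alpha_trimmed_mean(block, alpha_low=0.1, alpha_high=0.3):
    """Mean after trimming different fractions from the low and high tails."""
    v = np.sort(block.ravel())
    lo = int(alpha_low * v.size)
    hi = v.size - int(alpha_high * v.size)
    return v[lo:hi].mean()

def cs_lbp_code(patch, threshold=0.01):
    """4-bit center-symmetric LBP code of a 3x3 patch (basic CS-LBP, not GCS-LBP)."""
    pairs = [((0, 0), (2, 2)), ((0, 1), (2, 1)), ((0, 2), (2, 0)), ((1, 2), (1, 0))]
    code = 0
    for bit, (p, q) in enumerate(pairs):
        if patch[p] - patch[q] > threshold:   # compare center-symmetric pixel pairs
            code |= 1 << bit
    return code                               # value in [0, 15]

rng = np.random.default_rng(0)
print(asymmetric_alpha_trimmed_mean(rng.random((8, 8))))
print(cs_lbp_code(rng.random((3, 3))))
```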

To avoid “hard” grading, the SVR and fuzzy c-means (FCM) algorithm are used in series to map the above two extracted features to the final evaluated image visual privacy-preserving score. The FCM is used to classify the predicted quality score of the test image obtained from the SVR model and the subjective quality scores of all images in the training set. The final visual privacy-preserving score is the statistical average of the subjective privacy-preserving scores of the training images in the category of the test image.
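A sketch of this two-stage mapping, on synthetic placeholder data, is given below: an SVR predicts the quality score of a test image from its two features, a small hand-rolled fuzzy c-means step clusters that score together with the training scores, and the privacy score is averaged over the training images falling in the test image's cluster.

```python
# Illustrative two-stage mapping (SVR then FCM); all data are synthetic.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic training set: [contrast, salient-structure] features -> quality score.
X_train = rng.random((200, 2))
q_train = 0.6 * X_train[:, 0] + 0.4 * X_train[:, 1] + 0.05 * rng.standard_normal(200)
p_train = 1.0 - q_train                        # assumed subjective privacy scores

svr = SVR(kernel="rbf", C=10.0).fit(X_train, q_train)
q_test = svr.predict(rng.random((1, 2)))[0]    # predicted quality of a test image

def fcm_1d(values, c=3, m=2.0, iters=100):
    """Minimal 1-D fuzzy c-means; returns the fuzzy membership matrix (N x c)."""
    centers = np.quantile(values, np.linspace(0.1, 0.9, c))
    for _ in range(iters):
        d = np.abs(values[:, None] - centers[None, :]) + 1e-9
        u = 1.0 / (d ** (2 / (m - 1)))
        u /= u.sum(axis=1, keepdims=True)
        centers = (u ** m * values[:, None]).sum(axis=0) / (u ** m).sum(axis=0)
    return u

scores = np.append(q_train, q_test)            # cluster test score with training scores
u = fcm_1d(scores)
cluster = u[-1].argmax()                       # cluster of the test image
members = u[:-1].argmax(axis=1) == cluster     # training images in that cluster
privacy_score = p_train[members].mean()        # averaged privacy score of the cluster
print(round(float(privacy_score), 3))
```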

The rest of this paper is organized as follows. Section 2 gives a brief overview of CS, contrast, and LBP. In Section 3, we introduce the framework of the MCS model and illustrate the details of the proposed visual privacy-preserving level evaluation method. Experimental results and comparisons are discussed in Section 4. Finally, Section 5 concludes the paper.

Section snippets

Compressed sensing

In recent years, CS [21], [22], [23], [24] has been used to reconstruct sparse signals from a small number of linear measurements, bringing efficiency benefits to image transmission and processing. The CS process can capture and recover signals at a sub-Nyquist rate when they are sparse enough in some domain and the sensing matrix satisfies the restricted isometry property (RIP) [25]. Therefore, CS can be considered for compression-encryption applications due to its low linear
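As a brief illustration of the measurement step just described (not the paper's code), the sketch below projects a k-sparse signal of length n onto m < n Gaussian measurements; the sizes are arbitrary.

```python
# Minimal compressed-sensing measurement: y = Phi @ x with m << n.
import numpy as np

rng = np.random.default_rng(0)

n, m, k = 256, 64, 8                   # signal length, measurements, sparsity
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)   # k-sparse signal

Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # i.i.d. Gaussian sensing matrix
y = Phi @ x                                      # sub-Nyquist linear measurements

print(y.shape)   # (64,) -- far fewer samples than the original 256
```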

Notation

Scalars, vectors, and matrices are denoted by normal letters, bold lowercase letters, and bold capital letters, respectively. Significant notations that appear in this paper are listed in Table 1.

Framework

Our proposed scheme is based on the HVS's sensitivity to contrast and structural degradation. A flowchart of our proposed method is illustrated in Fig. 1. We build an MCS model in which images are dimension-reduced so that visual privacy preservation is gradually achieved as the number of layers increases. Each CS

Database description

Research on the objective evaluation of the image visual privacy-preserving level lacks public and consolidated databases. The research objects and content of some known publicly available privacy assessment databases differ, such as PEViD [76] for videos, the database in [77] for encrypted videos, PEViD-HDR [78] for high dynamic range videos, and others. However, this paper studies the visual privacy-preserving level of images in the MCS model, in which the

Conclusion

Considering the difficulty of balancing recognition tasks and privacy protection, this paper proposes an image MCS model based on an improved Gaussian random measurement matrix. Furthermore, we propose a visual privacy-preserving level evaluation method for MCS images, inspired by the fact that the HVS is sensitive to image contrast and structure information. In particular, to better exploit such features, a contrast measurement model based on the statistical mean of the asymmetric alpha-trimmed filter

CRediT authorship contribution statement

Jixin Liu: Conceptualization, Methodology, Writing - original draft, Validation. Zheng Tang: Investigation, Data curation, Software, Validation. Ning Sun: Writing - review & editing, Visualization. Guang Han: Writing - review & editing, Supervision. Sam Kwong: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by funds from the Provincial Natural Science Foundation of the Science and Technology Bureau of Jiangsu Province, China (Grant No. BK20180088), the China Postdoctoral Science Foundation (Grant No. 2019M651916), the Scientific Research Foundation of Nanjing University of Posts and Telecommunications, China (Grant No. NY218066), the Postgraduate Research & Practice Innovation Program of Jiangsu Province, China (Grant No. KYCX18_0919) and the Natural Science Foundation of

References (97)

  • Dai, T., et al., Referenceless quality metric of multiply-distorted images based on structural degradation, Neurocomputing (2018).
  • Fang, Y., et al., Visual acuity inspired saliency detection by using sparse features, Inform. Sci. (2015).
  • Pareek, N.K., et al., Image encryption using chaotic logistic map, Image Vis. Comput. (2006).
  • Kundu, M.K., et al., Thresholding for edge detection using human psychovisual phenomena, Pattern Recognit. Lett. (1986).
  • Ponomarenko, N., et al., Image database TID2013: Peculiarities, results and perspectives, Signal Process., Image Commun. (2015).
  • Liu, L., et al., No-reference image quality assessment based on spatial and spectral entropies, Signal Process., Image Commun. (2014).
  • Liu, L., et al., Blind image quality assessment by relative gradient statistics and adaboosting neural network, Signal Process., Image Commun. (2016).
  • Chen, X., et al., No-reference color image quality assessment: From entropy to perceptual quality, EURASIP J. Image Video Process. (2019).
  • Shortell, T., et al., Secure signal processing using fully homomorphic encryption.
  • Ziad, M.T.I., et al., Cryptoimg: Privacy preserving processing over encrypted images.
  • Zhang, Y., et al., Secure and efficient outsourcing of PCA-based face recognition, IEEE Trans. Inf. Forensics Secur. (2020).
  • Chaaraoui, A., et al., A vision-based system for intelligent monitoring: human behaviour analysis and privacy by context, Sensors (2014).
  • Chen, J., et al., VGAN-based image representation learning for privacy-preserving facial expression recognition.
  • Wu, Z., et al., Towards privacy-preserving visual recognition via adversarial training: A pilot study.
  • Ryoo, M.S., et al., Privacy-preserving human activity recognition from extreme low resolution.
  • Chou, E., et al., Privacy-preserving action recognition for smart hospitals using low-resolution depth images (2018).
  • Daly, S.J., Application of a noise-adaptive contrast sensitivity function to image data compression, Opt. Eng. (1990).
  • Wang, Z., et al., Image quality assessment: from error visibility to structural similarity, IEEE Trans. Image Process. (2004).
  • Padilla-López, J., et al., Visual privacy by context: Proposal and evaluation of a level-based visualisation scheme, Sensors (2015).
  • Erdélyi, A., et al., Adaptive cartooning for privacy protection in camera networks.
  • Xiang, T., et al., Perceptual visual security index based on edge and texture similarities, IEEE Trans. Inf. Forensics Secur. (2016).
  • Tong, L., et al., Visual security evaluation for video encryption.
  • Smola, A.J., et al., A tutorial on support vector regression, Stat. Comput. (2004).
  • Candès, E.J., et al., Quantitative robust uncertainty principles and optimally sparse decompositions, Found. Comput. Math. (2006).
  • Candès, E.J., et al., Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inform. Theory (2006).
  • Candès, E.J., et al., Near-optimal signal recovery from random projections: universal encoding strategies?, IEEE Trans. Inform. Theory (2006).
  • Donoho, D.L., Compressed sensing, IEEE Trans. Inform. Theory (2006).
  • Candès, E.J., et al., Decoding by linear programming, IEEE Trans. Inform. Theory (2005).
  • Zhang, Y., et al., A low-overhead, confidentiality-assured, and authenticated data acquisition framework for IoT, IEEE Trans. Ind. Inf. (2019).
  • Zhang, Y., et al., Serious challenges and potential solutions for the industrial internet of things with edge intelligence, IEEE Netw. (2019).
  • Andrés, A.M., et al., Face recognition on partially occluded images using compressed sensing, Pattern Recognit. Lett. (2014).
  • Davenport, M.A., et al., Signal processing with compressive measurements, IEEE J. Sel. Top. Signal Process. (2010).
  • Liu, J.-x., et al., Chaotic cellular automaton for generating measurement matrix used in CS coding, IET Signal Process. (2016).
  • Vu, C.T., et al., S3: A spectral and spatial measure of local perceived sharpness in natural images, IEEE Trans. Image Process. (2011).
  • Bahrami, K., et al., A fast approach for no-reference image sharpness assessment based on maximum local variation, IEEE Signal Process. Lett. (2014).
  • Peli, E., Contrast in complex images, J. Opt. Soc. Am. (1990).
  • Agaian, S.S., Visual morphology.
  • DelMarco, S., et al., The design of wavelets for image enhancement and target detection.