Visual privacy-preserving level evaluation for multilayer compressed sensing model using contrast and salient structural features
Introduction
With the rapid development of multimedia and the Internet, a massive number of images and videos are generated and distributed each day, so ensuring data security and privacy is important. Since signal and image processing applications often involve user-related data, privacy concerns have been widely raised [1], [2]. In many service-based applications, digital multimedia such as images and videos contain privacy-sensitive information that can easily be leaked and abused if collection, processing, or storage is performed improperly [3]. Some existing image privacy-preserving methods take the form of image encryption for cloud computing or image transmission [4], [5].
Visual data expose significant information about individuals appearing in images and videos [6]. From the standpoint of human vision, making content visually invisible maximizes the concealment of visual information in images and videos. In recent years, image recognition under visual privacy protection has received increasing attention. Chen et al. [7] presented an architecture that combines a VAE (Variational Auto-Encoder) and a GAN (Generative Adversarial Network) to create an identity-invariant representation of a face image for privacy-preserving facial expression recognition and face image synthesis. Wu et al. [8] proposed an adversarial training framework that explicitly learns a degradation transform for the original video inputs, leading to an adaptive and end-to-end manageable pipeline for privacy-preserving visual recognition. Zhang et al. [5] proposed a secure and efficient outsourcing protocol for face recognition through principal component analysis. However, all of these methods start from original images or videos: the visual privacy-protected or anonymized data that they process and recognize are derived from originals that have already been captured.
In some scenarios, we want cameras to capture as little privacy-sensitive information as possible while preserving the information needed for identification to the greatest extent. Directly collecting low-resolution or low-quality images and videos and using them for recognition and classification tasks is therefore an effective approach. Ryoo et al. [9] introduced an inverse super-resolution method that improves classification performance for human activity recognition using only extreme low-resolution anonymized videos. Chou et al. [10] used low-resolution depth images to remove privacy-relevant information while retaining activity-recognition utility. These methods assume that the processed low-quality or low-resolution images are already visually privacy-protected, and recognition is premised on that assumption. Although these methods can plausibly perform recognition under visual privacy protection to some extent, the visual privacy-preserving level of an image still needs to be assessed to determine whether visual privacy protection has truly been achieved.
As is well known, image quality significantly affects the performance of recognition algorithms. Using multiple images can enhance recognition performance, for example, but it introduces an additional computational burden. Predicting whether an image is suitable for recognition is therefore of great importance in real application scenarios, where a sequence of images is typically presented and the frame with the best quality should be selected for the subsequent matching and recognition tasks [11]. This makes image quality assessment (IQA) algorithms valuable. Objective IQA metrics can be divided into three types based on the amount of information obtained from a reference image: full-reference (FR), reduced-reference (RR), and blind/no-reference (NR). In practice, the original reference image may be unavailable, in which case the FR and RR methods are infeasible and NR methods are needed. To achieve the best prediction performance, IQA metrics attempt to model various processing mechanisms of the human visual system (HVS), such as contrast sensitivity [12] and structural information [13].
Similar to image quality assessment, when visual privacy-protected images are used for processing and recognition, evaluating the visual privacy-preserving level of an image becomes important, as it can guide recognition tasks when the image content is visually invisible or indistinguishable. Padilla-López et al. [14] presented a privacy scheme that uses visualization levels to display the real image in different ways for privacy preservation, evaluated by whether participants could extract the requested information from the images. To evaluate a globally applied privacy filter based on cartooning, Erdélyi et al. [15] employed the structural similarity (SSIM) index [13], the peak signal-to-noise ratio (PSNR), the standard Viola–Jones face detector, and three different face recognizers. Visual privacy-preserving level evaluation is conceptually similar to visual security evaluation in the fields of image encryption and steganography. The latter research is usually based on image quality assessment methods, and experimental results are generally compared with PSNR, SSIM, and other image quality assessment algorithms or with subjective analysis [16]. The local feature based visual security metric (LFBVS) introduced in [17] utilizes localized edge and luminance features that are combined and weighted based on their error magnitudes. However, this approach is an FR metric, meaning that it requires both the original and the test (encrypted) image to assess visual similarity.
In their evaluation by application domain, Hofbauer and Uhl [18] found that performance in the encrypted domain strongly indicates that none of the tested image metrics, including PSNR, SSIM, LFBVS, and other security metrics, can evaluate content confidentiality. As a result, existing work considers only subjective analysis of visual effects, image quality, or the recognition rate of standard detection and recognition methods, none of which sufficiently reflects the visual privacy-preserving level and perceptual security of an image. Support vector regression (SVR) [19] is commonly adopted in IQA to map feature vectors to subjective quality scores. Building on fuzzy set and membership theory in cybernetics and on the two-stage privacy-preserving collaborative fuzzy clustering scheme proposed by Lyu et al. [20], the image quality scores can be treated as fuzzy intervals, with a membership function chosen so that the interval boundaries are fuzzy, in order to map image quality scores to visual privacy-preserving scores.
The above concerns motivate the proposed efficient visual privacy-preserving level evaluation for images in the multilayer compressed sensing (MCS) model, i.e., MCS-VPLE, which provides guidance for recognition tasks under privacy preservation. Specifically, we construct three MCS visual privacy-preserving level evaluation databases to evaluate the proposed method. The experimental results demonstrate that our metric achieves strong prediction monotonicity and accuracy on the constructed databases. The main contributions of this work are summarized below.
Based on the theory of compressed sensing (CS), we propose an image MCS model that uses an improved Gaussian random measurement matrix to sample and encode images for recognition under privacy preservation. In the MCS model, an input image and a measurement matrix of the same size are divided into 2 × 2 blocks, and an inner product is computed between each image block and the corresponding matrix block to obtain the next layer of the CS image. To avoid feature loss and to keep the sampled data consistent, the improved Gaussian random measurement matrix is obtained by translating all elements of a Gaussian random measurement matrix and then normalizing them.
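The layer-by-layer sampling described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the text does not give the exact translation or normalization, so the sketch assumes the Gaussian matrix is shifted until every entry is positive and that each 2 × 2 block of the matrix is scaled to sum to one. The function names `improved_gaussian_matrix`, `mcs_layer`, and `mcs` are hypothetical, and image sides are assumed to be even at every layer.

```python
import numpy as np

def improved_gaussian_matrix(shape, rng):
    """Gaussian matrix shifted to be positive, then normalized per 2x2 block.

    Both the shift and the per-block normalization are assumptions made
    for this sketch; the source only names "translation" and "normalization".
    """
    h, w = shape
    phi = rng.standard_normal(shape)
    phi = phi - phi.min() + 1e-6                # translation: all entries > 0
    blocks = phi.reshape(h // 2, 2, w // 2, 2)  # [i, a, j, b] -> phi[2i+a, 2j+b]
    blocks = blocks / blocks.sum(axis=(1, 3), keepdims=True)
    return blocks.reshape(h, w)

def mcs_layer(image, phi):
    """One layer: inner product of each 2x2 image block with its matrix block."""
    h, w = image.shape
    prod = (image * phi).reshape(h // 2, 2, w // 2, 2)
    return prod.sum(axis=(1, 3))                # output is (h/2, w/2)

def mcs(image, layers, seed=0):
    """Return [layer 0 (input), layer 1, ..., layer `layers`]."""
    rng = np.random.default_rng(seed)
    out = [np.asarray(image, dtype=np.float64)]
    for _ in range(layers):
        current = out[-1]
        phi = improved_gaussian_matrix(current.shape, rng)
        out.append(mcs_layer(current, phi))
    return out
```

Under the assumed normalization, every output pixel is a convex combination of the four pixels in its block, so each layer halves both image dimensions while keeping values within the range of the previous layer.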
When extracting the contrast feature, we introduce a contrast measurement model, CAAME (Color/Cube Asymmetric Alpha-trimmed Mean Enhancement Contrast Measure), in which the statistical mean of the asymmetric alpha-trimmed filter replaces the conventional statistical mean used in CRME (Color/Cube Root Mean Enhancement). When extracting the salient structural feature, we propose a salient generalized center-symmetric local binary pattern (SGCS-LBP) operator. The histogram of the texture map is obtained from the GCS-LBP (Generalized Center-Symmetric LBP) operator, and each LBP label is weighted according to its saliency, which is obtained from the visual attention computational model GBVS (Graph-Based Visual Saliency). Based on the weighted histogram, we compute the statistical histogram variation trend value (HVTV) as the salient structural feature.
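Two of the building blocks named above can be illustrated with a short NumPy sketch. It is hedged in several ways: it shows only the asymmetric alpha-trimmed mean, not the full CAAME formula (which is not given here); it uses the plain 4-bit center-symmetric LBP rather than the generalized GCS-LBP operator; it takes the saliency map as an input instead of computing GBVS; and it stops at the weighted histogram, since the HVTV definition is not stated in this section. All function and parameter names are hypothetical.

```python
import numpy as np

def asymmetric_alpha_trimmed_mean(values, alpha_low, alpha_high):
    """Mean after discarding the alpha_low fraction of smallest samples
    and the alpha_high fraction of largest samples."""
    v = np.sort(np.asarray(values, dtype=np.float64).ravel())
    n = v.size
    lo = int(np.floor(alpha_low * n))
    hi = n - int(np.floor(alpha_high * n))
    if lo >= hi:
        raise ValueError("trimming removes every sample")
    return v[lo:hi].mean()

def cs_lbp(gray, threshold=0.0):
    """Plain center-symmetric LBP: one 4-bit code per interior pixel."""
    g = np.asarray(gray, dtype=np.float64)
    h, w = g.shape
    # Four center-symmetric neighbour pairs as (dy, dx) offsets.
    pairs = [((0, 1), (0, -1)), ((1, 1), (-1, -1)),
             ((1, 0), (-1, 0)), ((1, -1), (-1, 1))]
    codes = np.zeros((h - 2, w - 2), dtype=np.int64)
    for bit, ((dy1, dx1), (dy2, dx2)) in enumerate(pairs):
        a = g[1 + dy1:h - 1 + dy1, 1 + dx1:w - 1 + dx1]
        b = g[1 + dy2:h - 1 + dy2, 1 + dx2:w - 1 + dx2]
        codes |= (a - b > threshold).astype(np.int64) << bit
    return codes

def saliency_weighted_histogram(codes, saliency):
    """16-bin histogram in which each label counts with its saliency weight."""
    hist = np.bincount(codes.ravel(), weights=np.ravel(saliency), minlength=16)
    total = hist.sum()
    return hist / total if total > 0 else hist
```

The saliency map passed to `saliency_weighted_histogram` must have the same shape as `codes`, for example a full-size map cropped with `saliency[1:-1, 1:-1]`.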
To avoid “hard” grading, SVR and the fuzzy c-means (FCM) algorithm are used in series to map the two extracted features to the final visual privacy-preserving score of an image. FCM clusters the predicted quality score of the test image, obtained from the SVR model, together with the subjective quality scores of all images in the training set. The final visual privacy-preserving score is the statistical average of the subjective privacy-preserving scores of the training images that fall in the same category as the test image.
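The second stage of this mapping can be sketched as follows, assuming the SVR prediction is already available as a number. The sketch runs a basic fuzzy c-means on the scalar quality scores and averages the subjective privacy scores of the training images that share the test image's cluster. The number of clusters `c`, the fuzzifier `m`, the quantile-based initialization, and the hard assignment by largest membership are all assumptions of this illustration, and the function names are hypothetical.

```python
import numpy as np

def _memberships(x, centers, m):
    """Standard FCM membership update for scalar data."""
    d = np.abs(x[:, None] - centers[None, :]) + 1e-12  # avoid division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm_1d(x, c, m=2.0, iters=100):
    """Fuzzy c-means on scalars; returns (centers, memberships of shape (n, c))."""
    x = np.asarray(x, dtype=np.float64)
    # Deterministic start: evenly spaced quantiles (an assumption of this sketch).
    centers = np.quantile(x, (np.arange(c) + 0.5) / c)
    for _ in range(iters):
        um = _memberships(x, centers, m) ** m
        centers = um.T @ x / um.sum(axis=0)
    return centers, _memberships(x, centers, m)

def privacy_score(pred_quality, train_quality, train_privacy, c=3):
    """Average subjective privacy score of the training images that land in
    the same FCM cluster as the test image's predicted quality score."""
    scores = np.append(np.asarray(train_quality, dtype=np.float64), pred_quality)
    _, u = fcm_1d(scores, c)
    labels = u.argmax(axis=1)
    same = labels[:-1] == labels[-1]
    if not same.any():
        return float("nan")  # no training image shares the cluster
    return float(np.asarray(train_privacy, dtype=np.float64)[same].mean())
```

Because the output is an average over a cluster rather than a direct regression value, nearby quality scores map to the same privacy score, which is one way to read the "soft" grading described above.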
The rest of this paper is organized as follows. Section 2 gives a brief overview of CS, contrast, and LBP. Section 3 introduces the framework of the MCS model and details the proposed visual privacy-preserving level evaluation method. Experimental results and comparisons are discussed in Section 4. Finally, Section 5 concludes the paper.
Compressed sensing
In recent years, the CS [21], [22], [23], [24] has attempted to reconstruct a sparse signal from a small number of linear measurements and efficiently introduce benefits to image transmission and processing. The CS process can capture and recover signals at a sub-Nyquist rate when they are sparse enough in some domain and the sensing matrix satisfies the restricted isometry property (RIP) [25]. Therefore, the CS can be considered for compression-encryption applications due to its low linear
Notation
The scalars, vectors, and matrices are denoted as normal letters, bold lowercase, and bold capital letters, respectively. Significant notations that appear in this paper are listed in Table 1.
Framework
Our proposed scheme is based on the HVS’s sensitivity to contrast and structure degradation. A flowchart of our proposed method is illustrated in Fig. 1. We build an MCS model where images are dimension-reduced to gradually achieve visual privacy-preserving as the number of layers increases. Each CS
Database description
Research on the objective evaluation of the image visual privacy-preserving level lacks public, consolidated databases. The known publicly available privacy assessment databases differ in their research objects and content, such as the PEViD [76] for videos, the database in [77] for encrypted videos, the PEViD-HDR [78] for high dynamic range videos, and others. However, this paper studies the visual privacy-preserving level for images in the MCS model, in which the
Conclusion
Considering the difficulty of balancing recognition tasks and privacy protection, this paper proposes an image MCS model based on an improved Gaussian random measurement matrix. Furthermore, we propose a visual privacy-preserving level evaluation method for MCS images, inspired by the fact that HVS is sensitive to image contrast and structure information. In particular, to better exploit such features, a contrast measurement model based on the statistical mean of asymmetric alpha-trimmed filter
CRediT authorship contribution statement
Jixin Liu: Conceptualization, Methodology, Writing - original draft, Validation. Zheng Tang: Investigation, Data curation, Software, Validation. Ning Sun: Writing - review & editing, Visualization. Guang Han: Writing - review & editing, Supervision. Sam Kwong: Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by funds from the Provincial Natural Science Foundation of the Science and Technology Bureau of Jiangsu Province, China (Grant No. BK20180088), the China Postdoctoral Science Foundation (Grant No. 2019M651916), the Scientific Research Foundation of Nanjing University of Posts and Telecommunications, China (Grant No. NY218066), the Postgraduate Research & Practice Innovation Program of Jiangsu Province, China (Grant No. KYCX18_0919) and the Natural Science Foundation of
References (97)
et al., Secure and privacy-preserving data sharing in the cloud based on lossless image coding, Signal Process. (2018)
et al., A high-capacity reversible data hiding method for homomorphic encrypted images, J. Vis. Commun. Image Represent. (2019)
et al., Recognition oriented facial image quality assessment via deep convolutional neural network, Neurocomputing (2019)
et al., Identifying deficits of visual security metrics for images, Signal Process., Image Commun. (2016)
et al., Privacy-preserving collaborative fuzzy clustering, Data Knowl. Eng. (2018)
et al., A compressive sensing based privacy preserving outsourcing of image storage and identity authentication service in cloud, Inform. Sci. (2017)
et al., Efficiently and securely outsourcing compressed sensing reconstruction to a cloud, Inform. Sci. (2019)
et al., Multiple-target tracking based on compressed sensing in the internet of things, J. Netw. Comput. Appl. (2018)
et al., A visually secure image encryption scheme based on semi-tensor product compressed sensing, Signal Process. (2020)
et al., Safety for pedestrian recognition in sensor networks based on visual compressive sensing and adaptive prediction clustering, Saf. Sci. (2019)
Referenceless quality metric of multiply-distorted images based on structural degradation
Neurocomputing
Visual acuity inspired saliency detection by using sparse features
Inform. Sci.
Image encryption using chaotic logistic map
Image Vis. Comput.
Thresholding for edge detection using human psychovisual phenomena
Pattern Recognit. Lett.
Image database TID2013: Peculiarities, results and perspectives
Signal Process., Image Commun.
No-reference image quality assessment based on spatial and spectral entropies
Signal Process., Image Commun.
Blind image quality assessment by relative gradient statistics and adaboosting neural network
Signal Process., Image Commun.
No-reference color image quality assessment: From entropy to perceptual quality
EURASIP J. Image Video Process.
Secure signal processing using fully homomorphic encryption
Cryptoimg: Privacy preserving processing over encrypted images
Secure and efficient outsourcing of PCA-based face recognition
IEEE Trans. Inf. Forensics Secur.
A vision-based system for intelligent monitoring: human behaviour analysis and privacy by context
Sensors
VGAN-based image representation learning for privacy-preserving facial expression recognition
Towards privacy-preserving visual recognition via adversarial training: A pilot study
Privacy-preserving human activity recognition from extreme low resolution
Privacy-preserving action recognition for smart hospitals using low-resolution depth images
Application of a noise-adaptive contrast sensitivity function to image data compression
Opt. Eng.
Image quality assessment: from error visibility to structural similarity
IEEE Trans. Image Process.
Visual privacy by context: Proposal and evaluation of a level-based visualisation scheme
Sensors
Adaptive cartooning for privacy protection in camera networks
Perceptual visual security index based on edge and texture similarities
IEEE Trans. Inf. Forensics Secur.
Visual security evaluation for video encryption
A tutorial on support vector regression
Stat. Comput.
Quantitative robust uncertainty principles and optimally sparse decompositions
Found. Comput. Math.
Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information
IEEE Trans. Inform. Theory
Near-optimal signal recovery from random projections: universal encoding strategies?
IEEE Trans. Inform. Theory
Compressed sensing
IEEE Trans. Inform. Theory
Decoding by linear programming
IEEE Trans. Inform. Theory
A low-overhead, confidentiality-assured, and authenticated data acquisition framework for IoT
IEEE Trans. Ind. Inf.
Serious challenges and potential solutions for the industrial internet of things with edge intelligence
IEEE Netw.
Face recognition on partially occluded images using compressed sensing
Pattern Recognit. Lett.
Signal processing with compressive measurements
IEEE J. Sel. Top. Signal Process.
Chaotic cellular automaton for generating measurement matrix used in CS coding
IET Signal Process.
S3: A spectral and spatial measure of local perceived sharpness in natural images
IEEE Trans. Image Process.
A fast approach for no-reference image sharpness assessment based on maximum local variation
IEEE Signal Process. Lett.
Contrast in complex images
J. Opt. Soc. Am.
Visual morphology
The design of wavelets for image enhancement and target detection