Fair comparison of skin detection approaches on publicly available datasets

doi:10.1016/j.eswa.2020.113677

Expert Systems with Applications

Volume 160, 1 December 2020, 113677

https://doi.org/10.1016/j.eswa.2020.113677

Highlights

• Skin detection methods and applications are comprehensively reviewed.
• A rough classification of the methods tested in this work is proposed.
• Most common benchmarks and testing protocols are discussed and summarized.
• Performances are evaluated on 10 datasets using 14 methods.
• Insightful discussions and prospects for future work are given.

Abstract

Skin detection is the process of discriminating skin and non-skin regions in a digital image, and it is widely used in applications ranging from hand gesture analysis to body-part tracking and face detection. Skin detection is a challenging problem that has drawn extensive attention from the research community in the context of expert and intelligent systems; nevertheless, a fair comparison among approaches is very difficult due to the lack of a common benchmark and a unified testing protocol. In recent years, the success of deep convolutional neural networks (CNNs) has strongly influenced the field of image segmentation and has produced various successful models. However, due to the lack of large ground-truth datasets for skin detection, only a few works have addressed the problem using CNN models. In this work, we investigate the most recent research in this field and propose a fair comparison among approaches using several different datasets.

The major contributions of this work are (i) an exhaustive literature review of skin color detection approaches and a comparison of approaches that can be useful to researchers and practitioners to select the most suitable method for their application, (ii) the collection and examination of many datasets with ground truth for skin detection that can be useful to produce a training set for CNN models, (iii) a framework to evaluate and combine different skin detector approaches, whose source code is made freely available for future research, and (iv) an extensive experimental comparison among several recent methods which have also been used to define an ensemble that works well in many different problems.

Experiments are carried out on 10 different datasets including more than 10,000 labelled images: experimental results confirm that the best method proposed here obtains very good performance with respect to other stand-alone approaches, without requiring ad hoc parameter tuning.

A MATLAB version of the framework for testing and of the methods proposed in this paper will be freely available from https://github.com/LorisNanni.

Introduction

Skin texture and color are important cues that people use to infer a variety of culture-related aspects about each other, such as health, ethnicity, age, beauty and wealth. The presence of skin color in images and videos signals the presence of humans in such media. Therefore, over the last two decades, extensive research in the context of expert and intelligent systems has focused on skin detection in videos and images. Skin detection is the process of discriminating “skin” and “non-skin” regions in a digital image; it consists of performing a binary classification of pixels and executing a fine segmentation to define the boundaries of the skin regions. Currently, skin detection is a sophisticated process involving not only the training of models, but also numerous additional steps, including data pre-processing and post-processing.

Skin detection is used within many application domains: as a preliminary step for face detection (Hsu, Abdel-Mottaleb, & Jain, 2002) and tracking (De-La-Torre, Granger, Radtke, Sabourin, & Gorodnichy, 2015), body tracking (Argyros & Lourakis, 2004), hand detection (Roy, Mohanty, & Sahay, 2017) and gesture recognition (Han, Award, Sutherland, & Wu, 2006), biometric authentication (e.g. palm print recognition) (Sang, Ma, & Huang, 2013), objectionable content filtering (Lee, Kuo, Chung, & Chen, 2007), and medical imaging. In this work, a comprehensive analysis is carried out of how different expert systems (including artificial intelligence, deep learning, and machine learning systems) are designed to deal with the skin detection problem.

A useful feature for discriminating skin and non-skin pixels is the pixel color; nevertheless, obtaining skin color consistency across variations in illumination, diverse ethnicities and different acquisition devices is a very challenging task. Moreover, skin detection, when used as a preliminary step of other applications, is required to be computationally efficient, invariant to geometric transformations, partial occlusions and changes of posture or facial expression, insensitive to complex or pseudo-skin backgrounds, and robust to the quality of the acquisition device. The factor that most adversely affects skin detection is the color constancy problem (Khan, Hanbury, Stottinger, & Bais, 2012), i.e. the dependency of pixel intensity on both reflection and illumination, which have a nonlinear and unpredictable behavior. To remain effective when the illumination conditions vary rapidly, some skin detection approaches use image preprocessing strategies based on color constancy (i.e. a color correction method based on an estimate of the illuminant color) and/or dynamic adaptation techniques (i.e. the transformation of a skin-color model according to the changing illumination conditions). Static skin color approaches that rely on image preprocessing can only partially solve this problem, and their performance degrades strongly in real-world applications. A possible solution is to consider additional data acquired outside the visible spectrum (e.g. infrared images (Kong, Heo, Abidi, Paik, & Abidi, 2005) or spectral imaging (Healey, Prasad, & Tromberg, 2003)); however, the use of such sensors is not appropriate for all applications and entails higher acquisition costs, which limit their use to specific problems.
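To make the idea of a color correction based on an estimate of the illuminant color more concrete, the following is a minimal sketch of the classical gray-world correction. It is an illustrative stand-in and not necessarily the preprocessing used by any method reviewed here; the function name `gray_world` and the assumption of a float RGB image in [0, 1] are ours.

```python
import numpy as np

def gray_world(image: np.ndarray) -> np.ndarray:
    """Gray-world color correction for an H x W x 3 RGB image with values in [0, 1].

    Assumption (illustrative only): the average scene color is gray, so the
    per-channel mean of the image is taken as the illuminant estimate.
    """
    img = image.astype(np.float64)
    # Estimate the illuminant color as the mean of each channel.
    illuminant = img.reshape(-1, 3).mean(axis=0)
    # Target gray level: the mean of the three channel means.
    gray = illuminant.mean()
    # Per-channel gains that bring every channel mean to the gray level.
    gains = gray / np.maximum(illuminant, 1e-12)
    return np.clip(img * gains, 0.0, 1.0)
```

After the correction, the three channel means coincide (as long as no pixel is clipped), which removes a global color cast before a static skin color model is applied. As noted above, such preprocessing only partially addresses the color constancy problem.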

Skin detection is a challenging problem and has been extensively studied by the research community. Despite the large number of methods, there are only a few surveys on this topic: the works in (Kakumanu et al., 2007, Prema and Manimegalai, 2012) are quite old and cover only the methods proposed before 2005, while the surveys in (Chen et al., 2016, Mahmoodi and Sayedi, 2016, Naji et al., 2018) are more recent and contain a good investigation of methods, benchmarking datasets and performance over a period of about two decades. However, none of the above surveys offers a fair comparison among methods using the same testing protocols and datasets. The aim of the present work is not only to survey the most recent research in this field (which is now enriched with methods based on deep learning (Xu et al., 2015, Zuo et al., 2017, Kim et al., 2017a, Ma and Shih, 2018)), but also, and above all, to propose a framework for a fair comparison among approaches.

In this research, a novel framework is proposed that integrates different skin color classification approaches and compares their performance, individually and in combination, on several publicly available datasets. The major contributions of this research work are:

• An exhaustive literature review of skin color detection approaches, with a detailed description of the freely available methods.
• The collection and examination of almost all the datasets with ground truth for skin detection available in the literature. Such a collection, which includes more than 10,000 labelled images, can be useful to produce a training set for CNN models.
• A framework to evaluate and combine different skin detection approaches. The source code of the framework and of many of the tested methods will be made freely available for future research and comparisons. The system can be tuned according to the target application: depending on the application requirements, the acceptance threshold can be raised to prune a large percentage of false accepts at the cost of a small reduction in genuine accepts or, vice versa, lowered to admit more false accepts and maximize the number of genuine accepts. The framework includes training and testing protocols for the most used benchmark datasets in this field.
• A fair comparison among the most recent research and methods in the skin detection field, using the same testing protocols, benchmark datasets and performance indicators. The computation time of each method is also evaluated, so that the comparison covers complexity as well. The discussion of performance can help researchers and practitioners identify the approaches best suited to their requirements in terms of computational complexity, memory requirements, detection rate and sensitivity.
• Three different CNN architectures are trained for skin detection and the models are made available.
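The trade-off between false accepts and genuine accepts described in the third contribution can be sketched as follows. This is a hedged illustration, not the released MATLAB framework: the averaging fusion rule, the function names `fuse_scores` and `accept_rates`, and the toy score maps are all assumptions made for the example.

```python
import numpy as np

def fuse_scores(score_maps):
    """Combine per-pixel skin scores from several detectors by averaging.

    The averaging (sum) rule is an assumption for illustration; the fusion
    rule of the actual framework is not specified in this text.
    """
    return np.mean(np.stack(score_maps, axis=0), axis=0)

def accept_rates(scores, ground_truth, threshold):
    """Return (genuine accept rate, false accept rate) at a given threshold."""
    predicted = scores >= threshold
    skin = ground_truth.astype(bool)
    # Fraction of true skin pixels that are accepted as skin.
    genuine = np.logical_and(predicted, skin).sum() / max(skin.sum(), 1)
    # Fraction of non-skin pixels that are wrongly accepted as skin.
    false = np.logical_and(predicted, ~skin).sum() / max((~skin).sum(), 1)
    return float(genuine), float(false)

# Toy example: two hypothetical detectors scoring a 2 x 3 image.
ground_truth = np.array([[1, 1, 0], [0, 1, 0]])
detector_a = np.array([[0.9, 0.6, 0.4], [0.2, 0.8, 0.7]])
detector_b = np.array([[0.7, 0.4, 0.2], [0.4, 0.6, 0.3]])
fused = fuse_scores([detector_a, detector_b])

for threshold in (0.25, 0.45, 0.65):
    gar, far = accept_rates(fused, ground_truth, threshold)
    print(f"threshold={threshold:.2f}  genuine accepts={gar:.2f}  false accepts={far:.2f}")
```

On this toy data, raising the threshold from 0.25 to 0.45 removes two of the three false accepts without losing any genuine accept, while raising it further to 0.65 removes the last false accept at the cost of one genuine accept.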

The paper is organized as follows. Section 2 presents related work in skin detection, including a discussion of the taxonomy of existing approaches and a detailed description of the approaches tested in this work. Section 3 addresses the evaluation problem: the best-known datasets used for performance evaluation are listed and discussed, together with the testing protocols and performance indicators used in our experiments. Section 4 reports and discusses the experiments conducted using the proposed framework. Finally, Section 5 presents the conclusions and some future research directions.

Section snippets

Skin detection approaches

Several skin detection methods assume that skin color can be distinguished from background colors according to some clustering rule in a specific color space. Even if this assumption can be valid in a constrained environment where both the ethnicity of the people and the background colors are known, it is a very challenging task in complex images captured under unconstrained conditions and when individuals show a large spectrum of human skin coloration (Kakumanu, Makrogiannis, & Bourbakis, 2007). There

Skin detection evaluation: Datasets and performance indicators

To assist research in the area of skin detection, there are some well-known color image datasets provided with ground truth. The use of a standard and representative benchmark is essential to execute a fair empirical evaluation of skin detection techniques.
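As a rough illustration of how a detector's output can be scored against ground truth, the sketch below pools pixel-level counts over a whole dataset and derives precision, recall and F1. These are common indicators for this task; the specific indicators and protocols used by the authors are not stated in this excerpt, and the function names are hypothetical.

```python
import numpy as np

def confusion_counts(predicted_mask, truth_mask):
    """Pixel-level (TP, FP, FN, TN) counts for one binary skin mask."""
    pred = np.asarray(predicted_mask, dtype=bool)
    truth = np.asarray(truth_mask, dtype=bool)
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    tn = int(np.sum(~pred & ~truth))
    return tp, fp, fn, tn

def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1, returning 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def evaluate_dataset(pairs):
    """Pool pixel counts over all (prediction, ground truth) pairs, then score."""
    totals = np.zeros(4, dtype=np.int64)
    for predicted_mask, truth_mask in pairs:
        totals += confusion_counts(predicted_mask, truth_mask)
    tp, fp, fn, _ = (int(v) for v in totals)
    return precision_recall_f1(tp, fp, fn)
```

Pooling counts at the pixel level weights large images more heavily than small ones; averaging per-image scores instead is an equally plausible protocol, which is one reason a shared testing protocol matters for a fair comparison.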

A fair experimental comparison

A fair comparison among different approaches is very difficult due to the lack of a universal evaluation standard: most published works are tested on self-collected datasets, which are often not available for further comparison; in many cases the testing protocol is not clearly explained; many datasets are not of high quality; and the precision of the ground truth is questionable, since lips, mouth, rings and bracelets have sometimes been labelled as skin. In this section, we carry out a

Conclusion and future research directions

In this work, a new framework to evaluate and combine different skin detection approaches is presented, and an extensive evaluation of several approaches is carried out on 10 different datasets including more than 10,000 labelled images. A survey of the most recent existing approaches is carried out, three well-known deep learning models for data segmentation are trained and tested on this classification problem and four new ensembles based on the combination of nine methods (including 3 CNNs) are

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We would like to acknowledge the support that NVIDIA provided us through the GPU Grant Program. We used a donated TitanX GPU to train the CNNs used in this work.

References (66)

H.K. Al-Mohair et al., Hybrid human skin detection using neural network and K-means clustering technique, Applied Soft Computing (2015)
N. Brancati et al., Human skin detection through correlation rules between the YCb and YCr subspaces based on dynamic color clustering, Computer Vision and Image Understanding (2017)
A. Cheddad et al., A skin tone detection algorithm for an adaptive approach to steganography, Signal Processing (2009)
Y.H. Chen et al., Statistical skin color detection method without color transformation for real-time surveillance systems, Engineering Applications of Artificial Intelligence (2012)
P. Kakumanu et al., A survey of skin-color modeling and detection methods, Pattern Recognition (2007)
M. Kawulok et al., Spatial-based skin detection using discriminative skin-presence features, Pattern Recognition Letters (2014)
R. Khan et al., Color based skin classification, Pattern Recognition Letters (2012)
S.G. Kong et al., Recent advances in visual and infrared face recognition – A review, Computer Vision and Image Understanding (2005)
J.-S. Lee et al., Naked image detection based on adaptive and extensible skin color model, Pattern Recognition (2007)
J.C. Sanmiguel et al., Skin detection by dual maximization of detectors agreement for video monitoring, Pattern Recognition Letters (2013)
S.J. Schmugge et al., Objective evaluation of approaches of skin detection using ROC analysis, Computer Vision and Image Understanding (2007)
A. Angelova et al., Pruning training sets for learning of object categories, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2005)
A.A. Argyros et al., Real-time tracking of multiple skin-colored objects with a possibly moving camera, Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2004)
V. Badrinarayanan et al., SegNet: A deep convolutional encoder-decoder architecture for image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence (2017)
J.P.B. Casati et al., SFA: A human skin image database based on FERET and AR facial images (2013)
W.C. Chen et al., Region-based and content adaptive skin detection in color images, International Journal of Pattern Recognition and Artificial Intelligence (2007)
W. Chen et al., Skin color modeling for face detection and segmentation: A review and a new approach, Multimedia Tools and Applications (2016)
L. Chen et al., A skin detector based on neural network
L.C. Chen et al., Encoder-decoder with atrous separable convolution for semantic image segmentation, Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018)
C.Ó. Conaire et al., Detector adaptation by maximising agreement between independent data sources, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2007)
M. De-La-Torre et al., Partially-supervised learning from facial trajectories for face recognition in video surveillance, Information Fusion (2015)
Dourado, A., Guth, F., de Campos, T. E., & Li, W. (2019). Domain adaptation for holistic skin detection. CoRR,...
A. Gupta et al., Robust skin segmentation using color space switching, Pattern Recognition and Image Analysis (2016)
J. Han et al., Automatic skin segmentation for gesture recognition combining region and support vector machine active learning
G. Healey et al., Face recognition in hyperspectral images, IEEE Transactions on Pattern Analysis and Machine Intelligence (2003)
R.L. Hsu et al., Face detection in color images, IEEE Transactions on Pattern Analysis and Machine Intelligence (2002)
L. Huang et al., Human skin detection in images by MSER analysis
N.B. Ibrahim et al., A dynamic skin detector based on face skin tone color
S. Jairath et al., Adaptive skin color model to improve video face detection
Z. Jiang et al., Skin Detection Using Color, Texture and Space Information, Fourth International Conference on Fuzzy Systems and Knowledge Discovery (2007)
M.J. Jones et al., Statistical color models with application to skin detection, International Journal of Computer Vision (2002)
M. Kawulok, Fast propagation-based skin regions segmentation in color images
M. Kawulok et al., Self-adaptive algorithm for segmenting skin regions, EURASIP Journal on Advances in Signal Processing (2014)

