
Skin Detect2

This paper presents a hybrid human skin detection method that combines a Multilayer Perceptron artificial neural network with k-means clustering to improve accuracy in detecting skin regions in images. The proposed approach utilizes both color and texture features, achieving an F1-measure of 87.82% based on experimental results. The study highlights the limitations of existing skin detection methods and emphasizes the importance of selecting the appropriate color space for effective classification.


Applied Soft Computing 33 (2015) 337–347

Contents lists available at ScienceDirect

Applied Soft Computing


journal homepage: www.elsevier.com/locate/asoc

Hybrid Human Skin Detection Using Neural Network and K-Means Clustering Technique
Hani K. Al-Mohair, Junita Mohamad-Saleh ∗ , Shahrel Azmin Suandi
School of Electrical & Electronic Engineering, Universiti Sains Malaysia, 14300 Nibong Tebal, Pulau Pinang, Malaysia

Article info

Article history:
Received 5 December 2014
Received in revised form 19 April 2015
Accepted 26 April 2015
Available online 7 May 2015

Keywords:
Skin color detection
Color space
Neural networks
Texture analysis
k-Means

Abstract

Human skin detection is an essential step in most human detection applications, such as face detection. The performance of any skin detection system depends on assessment of two components: feature extraction and detection method. Skin color is a robust cue used for human skin detection. However, the performance of color-based detection methods is constrained by the overlapping color spaces of skin and non-skin pixels. To increase the accuracy of skin detection, texture features can be exploited as additional cues. In this paper, we propose a hybrid skin detection method based on YIQ color space and the statistical features of skin. A Multilayer Perceptron artificial neural network, which is a universal classifier, is combined with the k-means clustering method to accurately detect skin. The experimental results show that the proposed method can achieve high accuracy with an F1-measure of 87.82% based on images from the ECU database.

Crown Copyright © 2015 Published by Elsevier B.V. All rights reserved.

1. Introduction

Separating an image into regions that consist of groups of identical linked pixels is an image processing stage called image segmentation. The homogeneity of a region can be defined by the color, gray levels, and texture, among other factors [1]. Skin detection is a good example of image segmentation, which can be accomplished by classifying image pixels as skin and non-skin pixels [2].

The importance of skin color detection comes from its use as a primary operation in many applications, such as face detection [3], surveillance systems [4], Internet pornographic image filtering [5] and gesture analysis [6]. For example, face detection is accomplished by taking out the joint facial characteristics and by employing skin color detection as a primary step to specify the face area. As a result, accurate and fast face detection can be accomplished.

Many researchers have investigated skin color features. Their results have shown that skin color has a limited range of hues and is not deeply saturated [7]. Thus, human skin color is clustered within a small area in the color space. In the past few years, several algorithms have been proposed for skin detection. However, some factors, such as illumination, may make skin color detection a very difficult task [2]. The existing algorithms can be categorized into four classes: explicit skin classifiers, parametric classifiers, nonparametric classifiers, and dynamic classifiers [8]. The explicit classifiers, which are the easiest and are frequently employed, use a threshold strategy to distinguish between skin and non-skin pixels [9]. Basically, they characterize the limits of the skin region by utilizing a set of fixed thresholds. Although such classifiers are direct and might be used without any prior training steps, they may need adaptability when utilized under distinct imaging conditions. This may result in incorrect pixel detection [8].

Parametric classifiers can be based on a single Gaussian model [10], multiple Gaussian clusters [11], a mixture of Gaussian (MoG) models [12], or an elliptic boundary model [13]. Generally, the characterization speed of these classifiers is slow. In fact, they need to process each pixel individually. Additionally, these methods have low detection accuracy, as they rely on approximated parameters rather than authentic appropriate skin colors [8]. Furthermore, their performance varies depending on the utilized color space [14].

In the nonparametric classifiers, a set of training data is essential for estimating the statistical model of skin color distribution [15]. The advantages of these classifiers are quick training and skin distribution shape independence [14]. Nevertheless, such classifiers are not precise enough because of the requirement for an unbounded amount of training information, which makes them appropriate in a constrained scope of imaging conditions [8].

∗ Corresponding author. Tel.: +60 45996027.
E-mail addresses: hamo [email protected] (H.K. Al-Mohair), [email protected] (J. Mohamad-Saleh).

http://dx.doi.org/10.1016/j.asoc.2015.04.046
1568-4946/Crown Copyright © 2015 Published by Elsevier B.V. All rights reserved.
To overcome the generality of the previous static skin model classifiers, dynamic classifiers, which are based on artificial neural networks (ANN) and/or genetic algorithms, have been proposed [16]. The flexibility and ability of ANNs to adapt to various image conditions make them a good choice for enhancing classification tasks for human skin pixels [7].

Most of the existing skin detection methods segment images using only skin color information. Based on the fact that the skin region of an image is a group(s) of homogeneous connected pixels, texture information can be used to describe skin regions. Different texture descriptors, such as homogeneity, uniformity, and standard deviation, may be exploited for detection purposes [17].

Although color information plays the main role in modeling of skin, selecting the proper color space to represent skin is crucial. Several comparisons between different color spaces used for skin detection can be found in the literature [18–22], but one important question remains unanswered: “what is the best color space for skin detection?” Many authors do not provide strict justification of their color space choice. Some of them cannot explain the contradicting results between their experiments or experiments reported by others [22]. Moreover, some authors think that selecting a specific color space is more related to personal taste than to experimental evidence [18].

In this paper, a hybrid classifier that combines ANN with k-means clustering is proposed. The proposed classifier exploits the color and texture information to detect skin regions. This paper is organized as follows. In Section 2, an overview of related skin detection methods is presented. Section 3 describes the existing algorithms that form the background for the proposed method. The proposed skin detection method is described in Section 4. The experimental results are reported in Section 5; finally, a research discussion is presented in Section 6, and the conclusion is summarized in Section 7.

2. Related work

Artificial neural networks are interconnections of artificial neurons that, incredibly, mimic the organic neurons of the human brain. The main point of utilizing ANNs within skin color classification is to enhance the separability between skin and non-skin pixels. The first attempt to use an ANN for skin detection was made by Chen et al. [23]. They employed an ANN using the back propagation algorithm, utilizing the normalized RGB color space to reduce the sensitivity to illumination variations. Seow et al. introduced a skin color model based on an ANN used for face detection applications. They aimed to minimize the restriction relating to skin color variations between different races [24]. They were trying to create a self-adaptive color model based on a back propagation training algorithm and RGB color space. Yang et al. [25] employed a back propagation algorithm along with a Gaussian model classifier. They intended to enhance the accuracy of skin detection by using the Y component of YCbCr color space to overcome the impact of image illumination.

Zaidan et al. [26] proposed a module that hybridized a BP algorithm and heuristic rule method using YCbCr color information. The objective of their algorithm was to increase the classification reliability by removing the impact of illumination. The separability between skin and non-skin pixels relies on the output of the ANN and heuristic rules. Kakumanu et al. [27] adopted image chromatic adaptation. A neural network is used in this algorithm, which comprises two steps: first, to investigate the chromatic adaptation of the image, and second, to detect the human skin.

Duan et al. [28] applied the ability of Pulse Coupled NN (PCNN) to ease the imitation of the human vision technique by finding the relationships of the adjacent pixels. Bhoyar and Kakde [29] proposed a novel algorithm based on two output layer neurons, one each for the skin and non-skin classes. The novel method employed the Error Back Propagation Training Algorithm utilizing RGB color information. In this method, the threshold was used to overcome overlapping skin regions.

Doukim et al. [16] proposed numerous strategies using a Multi-Layer Perceptron ANN and the Y component from YCbCr color information. They employed a modified growing method to determine the number of neurons in the hidden layer. C-HN-O topology was employed in this work. A BP algorithm that combined the features of color and texture was proposed by Taqa and Jalab [17] to increase the reliability of the skin classification operation. Statistical models were used to estimate the texture features that increased the skin reliability.

The mentioned ANN-based techniques performed well in detection accuracy; however, these techniques have some drawbacks. All methods utilized a threshold value with no optimization process used to estimate these values, and the authors used the trial and error method, which is not an accurate procedure. In addition, none of the mentioned algorithms conducted a real experiment using the ANN to select the color space. The selection relied on others’ work. In addition to these drawbacks, the negative effect of illumination conditions and near-skin-color backgrounds represents a challenge that must be overcome in order to develop reliable and robust detection algorithms.

The k-means clustering method was used by Sree et al. [30] to enhance the accuracy of skin detection. In their algorithm, k-means clustering is used to cluster the image into three clusters after using explicit rules. One of the three clusters contains skin regions and the other two contain the background and the edges of the skin regions. However, the authors did not explain how their algorithm selects the cluster that represents the skin area, which is a major issue in their algorithm.

Another algorithm that exploits k-means clustering for skin detection was proposed by Bevilacqua et al. [31]. In their algorithm, the image is first converted into CIE L*a*b color space, and then the image pixels are segmented into three clusters based on the a and b channels. Supposing that one of the clusters will contain most of the skin pixels, the centroids of the three clusters are used to train an ANN. The objective of the ANN is to detect which cluster is likely to contain skin. This neural network has six neurons at the input layer, a hidden layer of nine neurons and an output layer of three neurons. The output of the ANN is the probability that the cluster has skin. For example, if the output of the ANN is (0.000020), (0.024), (0.98), it means that the third cluster has skin pixels. After that, the segmented image undergoes a cover-holes filter and a skin filter based on elliptical boundary models to enhance the output. The algorithm is based on the assumption that most of the skin pixels will be clustered in one cluster using the a and b channels. However, this is not true for all images as the images vary in their background and illumination conditions. Fig. 1 shows an example of segmenting an image using k-means clustering based on the ab channels. It is obvious that approximately half of the skin area is in cluster 1 and the other half is in cluster 2. That means half of the skin area will not be detected as the ANN will select only one cluster. As a result, the accuracy of detection will decrease dramatically.

3. Theoretical background

In this section, three methods, which form the basis for the proposed method, are described, namely: (1) MLP ANN, (2) Differential Evolution (DE), and (3) k-means clustering. In addition, this section presents the skin color and texture information that are employed in this paper.
Fig. 1. Segmentation of image using k-means clustering based on ab channels (k = 3).

3.1. MLP ANN

In this work, the MLP was implemented using the MATLAB Neural Network Toolbox. For each configuration, the training process for each MLP structure was repeated 20 times with a different number of neurons to find the best performance (i.e., the global minimum) of the network error space in terms of minimum mean square error (MSE).

3.2. Differential Evolution (DE)

DE is a simple and fast heuristic search method that simulates the basic idea of organism evolution [32]. Using iterations, the DE method optimizes a given problem by trying to improve a candidate solution with regard to a given measure of quality. It is used for multidimensional real-valued functions without using the gradient of the problem being optimized. In contrast to classic optimization methods, such as gradient descent and quasi-Newton methods, DE does not require the optimization problem to be differentiable. Therefore, DE can also be used for optimization problems that are discontinuous, noisy and variable over time. In addition, the DE algorithm has fewer tunable parameters and has the ability to self-organize. The simplicity and ease of implementation make the DE algorithm very popular; this algorithm can be exploited in a wide range of applications such as digital filter design [33], shape reconstruction [34], and digital image watermarking [35].

When DE is used to optimize a function with n real parameters, the DE will generate a population of candidate solutions within predefined boundaries. All candidate solutions are tested and evaluated in an attempt to locate the minima of the objective function. If the optimum solution is not reached, a genetic algorithm is used to generate another population. During each iteration, called a generation, new candidate solutions are generated by the combination of solutions randomly chosen from the current population (mutation). The resulting solutions are then mixed with a predetermined target solution. This operation is called recombination and produces the trial solution. Finally, the trial solution is accepted for the next generation if and only if it yields a reduction in the value of the objective function, i.e. the stop criterion. This last operator is referred to as selection. Fig. 2 illustrates the steps of the DE method.

Fig. 2. The DE algorithm.

3.3. k-Means clustering method

k-Means is a popular unsupervised learning algorithm that is used in a wide range of applications, such as data mining, because of its simplicity [36]. The purpose of k-means clustering is to separate n observations into k clusters so that each observation is assigned to the cluster with the nearest mean. Each cluster has a centroid, and the centroids should be defined and distinct from each other. After defining the k centroids, the next step is to take each point belonging to a given data set and associate it with the nearest centroid. The grouping is complete when no point is pending. Then, k new centroids are recalculated as barycenters of the clusters resulting from the previous step. After that, a loop re-binds the same data set points to the nearest new centroid. During the loop, the k centroids change their location step by step, and the loop stops when no more changes can be made. In addition to its simplicity, k-means works well with large datasets. If k is small, k-means may be computationally faster than other techniques such as hierarchical clustering [37]. Moreover, k-means may produce tighter clusters than hierarchical clustering [38]. It can be employed in many fields, such as medical image segmentation [39], brain tumor detection [40], and content-based image retrieval [41].

3.4. Skin color and texture information

Digital image segmentation and classification are good examples of research areas that depend on color as a significant source of information. Nevertheless, some of the original colors are not suitable for certain types of image processing, and hence, adjustment of the color space is required for image analysis. For this adjustment, colors should be transferred from one space to another to obtain accurate results.

A skin detection process can be summarized in two steps: express the image using an appropriate color space and classify the skin pixels based on the assigned skin samples using inference methods [2]. Selecting the proper color space is essential for skin color detection.

Different color spaces are compared for skin detection; comparison results show that there is no consensus on the best color space for skin detection. Most researchers do not provide good justification for selecting the appropriate color space [14] or for clarifying conflicts between the results of other authors [27]. Furthermore, some authors depend on their personal experience more than scientific experiments [28]. However, most researchers indicated that different skin detection techniques work differently using different color spaces [2,14]. In this work, six color spaces are investigated
using an MLP ANN to identify the color space best suited to skin detection. The color spaces used are RGB, YCbCr, YIQ, YDbDr, HSV, and CIE L*a*b.
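
Transferring colors from one space to another, as described in this section, amounts to a fixed per-pixel transform. The following is a minimal sketch for the RGB to YIQ case only. It assumes the commonly cited NTSC coefficients, since the paper does not list the conversion matrix it used, and the function names `rgb_to_yiq` and `image_to_yiq` are illustrative.

```python
def rgb_to_yiq(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to (Y, I, Q).

    Assumption: commonly cited NTSC coefficients; the paper does not
    state which conversion matrix was used.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    i = 0.596 * r - 0.274 * g - 0.322 * b   # in-phase chrominance
    q = 0.211 * r - 0.523 * g + 0.312 * b   # quadrature chrominance
    return y, i, q


def image_to_yiq(pixels):
    """Convert a list of 8-bit (R, G, B) tuples to a list of (Y, I, Q) tuples."""
    return [rgb_to_yiq(r / 255.0, g / 255.0, b / 255.0) for r, g, b in pixels]
```

With these coefficients a gray pixel (R = G = B) maps to zero chrominance, so I and Q carry only the color information while Y carries brightness.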

4. The proposed method

This work proposes an optimized method for human skin detection. By optimized, we mean selecting the set of input features that can be optimally used by an MLP ANN to achieve the best detection accuracy. Developing the proposed method involves two phases: training an MLP ANN to detect optimized input features and enhancing skin detection using k-means and an MLP ANN. During the first phase, the DE algorithm is used to select optimal input variables as a combination of the color and texture information. In the second phase, the k-means clustering method is used to enhance the output of the optimized MLP ANN. The following sub-sections show the process of developing our method.

Based on the fact that the skin region of an image is a group(s) of homogeneous connected pixels, texture information can be used to describe skin regions. In this paper, six different texture descriptors will be investigated: uniformity, standard deviation, skewness, kurtosis, smoothness, and entropy. The six texture descriptors were chosen for their simplicity and efficiency, which makes them suitable for real-time skin-based applications. Much more sophisticated texture descriptors may be exploited, such as Gabor filters, but they slow down the detection process. The formulas that can be used to calculate the six descriptors are given in Table 1.

4.1. Preparing the dataset

In preparation for training an MLP ANN, three types of blocks (patches) are collected: pure skin blocks, pure non-skin blocks, and mixed blocks. Table 2 illustrates the details of the training data used for the training process.

Referring to Table 2, Humanae is a chromatic inventory that provides a dataset containing a broad collection of images of different people [43]. In addition to the diversity, the images of the Humanae dataset are of high resolution compared to the Compaq dataset. One hundred fifty images containing human skin were downloaded from the “Humanae Project” webpage, and then blocks of 40 × 40 pixels were selected manually from different areas of the image, as shown in Fig. 3. The blocks were selected from the forehead, cheeks, shoulders, or chest. These locations were chosen to take into consideration any small differences in skin color tone that may exist between the different body areas of the same person.

Fig. 3. Blocks extracted for training from Humanae dataset.

The mixed blocks are blocks that contain both skin and non-skin pixels. They were selected from images of the ECU dataset in such a manner that they have almost an equal number of skin and non-skin pixels. The selection process was completed manually; achieving equal numbers of skin and non-skin pixels was difficult. However, the resultant data set contains 1,173,435 skin pixels out of 2,400,000 pixels (equivalent to 48.49% of the dataset).

The Humanae dataset was used to collect pure skin patches while the pure non-skin patches were gathered from a self-collected image dataset. Although those two datasets have no ground truth, it is easy to generate the ground truth for these patches, which is either all 1s or all 0s. However, the proposed algorithm extracts data from 4 × 4 sub-blocks, and in real situations the sub-blocks cannot be only pure skin or pure non-skin. So, to simulate real situations, a third type of patch, which contains both skin and non-skin pixels, is needed. Those patches are collected from images of the ECU dataset, which already has ground truth images. The patches which have been collected cover a wide range of human races and skin colors and different illumination conditions.

During the process of collecting the training data, each block is divided into sub-blocks of 4 × 4 pixels. For each sub-block, the RGB channels are separated and the corresponding 4 × 4 matrices are converted into three 16-element vectors. After that, the sub-block is converted into gray-level, and the six texture parameters (standard deviation, kurtosis, etc.) are calculated. For each sub-block, there will be six values that describe the texture of that sub-block. Since the proposed algorithm classifies pixels, not blocks, the six values are repeated 16 times to form six 16-element vectors. The color information (three 16-element vectors) and the texture information (six 16-element vectors) are then concatenated into one 9 × 16 array. The process is repeated for all sub-blocks, and the resultant data is combined to form a 9 × 2,400,000 array.

The process of collecting the training data from the blocks is illustrated in Fig. 4. It can be seen in the resultant matrix that the R, G, and B values are different for each pixel, but the texture descriptors’ values are the same. The process is repeated for each color space to obtain six different training datasets. In the same manner, for the mixed blocks, the corresponding ground truth of each block is divided into sub-blocks, and the binary values (1 for skin and 0 for non-skin) in the sub-blocks are converted into a 16-element vector. This vector will be the desired output during ANN training. The two processes are repeated for all sub-blocks and blocks, and the resultant vectors are concatenated in two arrays: input (9 × 2,400,000) and output (1 × 2,400,000).

4.2. Metrics

Skin detection in digital images can be considered as a classification problem, where image pixels, the dataset, are classified into skin pixels and non-skin pixels. Generally, the ratio of skin pixels to the total number of pixels in the images varies significantly. This makes the image an unbalanced dataset because the number of skin pixels does not equal the number of non-skin pixels. Measuring the detection accuracy as the total number of predictions that were correct may not be an adequate performance measure when the dataset is imbalanced [44]. Hence, we need appropriate metrics that reflect the actual accuracy. In this paper, we adopted precision (P), recall (R), and F1-measure, which are commonly used measures in the field of information retrieval [45]. Precision is a measure of the probability that a detected region is correct, while recall is the
Table 1
Statistical texture descriptors used [42].

| Descriptor | Formula | Description |
|---|---|---|
| Standard deviation ($\sigma$) | $\sigma = \sqrt{\sum_{i=0}^{L-1} (z_i - m)^2 \, p(z_i)}$ | A measure of average contrast. |
| Uniformity ($U$) | $U = \sum_{i=0}^{L-1} p^2(z_i)$ | The uniformity measure is maximum when all gray levels are equal (maximally uniform) and decreases from there. |
| Smoothness ($r$) | $r = 1 - \dfrac{1}{1 + \sigma^2}$ | Measure of the relative smoothness of the intensity in a region. $r$ is 0 for a region of constant intensity and approaches 1 for regions with large excursions in the values of its intensity levels. |
| Entropy ($e$) | $e = -\sum_{i=0}^{L-1} p(z_i) \log_2 p(z_i)$ | A measure of randomness. |
| Skewness ($S$) | $S = \dfrac{E(z_i - m)^3}{\sigma^3}$ | A measure of the asymmetry of the intensity values around the mean (average) intensity. |
| Kurtosis ($K$) | $K = \dfrac{E(z_i - m)^4}{\sigma^4}$ | A measure of how outlier-prone a distribution is. The kurtosis of the normal distribution is 3. Distributions that are more outlier-prone than the normal distribution have kurtosis greater than 3; distributions that are less outlier-prone have kurtosis less than 3. |

Notes: $L$ is the number of possible intensity levels. $z_i$ is a random variable indicating intensity. $p(z)$ is the histogram of intensity levels in a region. $m$ is the mean (average) intensity. $E(t)$ represents the expected value of the quantity $t$.
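
The descriptors in Table 1 can be computed directly from the normalized histogram of a gray-level region. The following is a minimal sketch rather than the authors' implementation: the function name `texture_descriptors` and the returned key names are illustrative, the formulas are applied to raw gray levels exactly as written in the table (no intensity normalization is assumed), and returning 0.0 for skewness and kurtosis of a constant region is a convention chosen here because both ratios are undefined when the standard deviation is zero.

```python
import math
from collections import Counter


def texture_descriptors(gray_levels):
    """Return the six Table 1 descriptors for a flat list of gray levels."""
    n = len(gray_levels)
    # p(z_i): normalized histogram over the intensity levels that occur.
    p = {z: count / n for z, count in Counter(gray_levels).items()}

    m = sum(z * pz for z, pz in p.items())                 # mean intensity
    var = sum((z - m) ** 2 * pz for z, pz in p.items())    # sigma squared
    sigma = math.sqrt(var)

    uniformity = sum(pz ** 2 for pz in p.values())
    smoothness = 1.0 - 1.0 / (1.0 + var)
    entropy = -sum(pz * math.log2(pz) for pz in p.values())

    if sigma == 0.0:
        # Assumption: constant regions get 0.0 because the ratios are undefined.
        skewness = 0.0
        kurtosis = 0.0
    else:
        skewness = sum((z - m) ** 3 * pz for z, pz in p.items()) / sigma ** 3
        kurtosis = sum((z - m) ** 4 * pz for z, pz in p.items()) / sigma ** 4

    return {
        "std": sigma,
        "uniformity": uniformity,
        "smoothness": smoothness,
        "entropy": entropy,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }
```

For a 4 × 4 sub-block the input is simply the 16 gray levels in any order, since all six descriptors depend only on the histogram.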

Fig. 4. Collecting training data from Humanae dataset blocks.
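
The data collection step that Fig. 4 illustrates (three 16-element color vectors plus six repeated texture values per 4 × 4 sub-block) can be sketched as follows. This is a hedged illustration, not the authors' code: `subblock_features`, `stack_subblocks`, and the `texture_fn` hook are hypothetical names, and the luminance weights used for the gray-level conversion are an assumption because the paper does not state how the conversion was done.

```python
def subblock_features(rgb_block, texture_fn):
    """Build the 9 x 16 feature array for one 4 x 4 sub-block.

    rgb_block  : list of 16 (R, G, B) tuples in row-major order.
    texture_fn : hypothetical hook mapping 16 gray levels to six texture
                 values (any implementation of the Table 1 descriptors).
    """
    if len(rgb_block) != 16:
        raise ValueError("expected a 4 x 4 sub-block (16 pixels)")

    # Three 16-element color vectors, one per channel.
    color_rows = [[pixel[c] for pixel in rgb_block] for c in range(3)]

    # Gray-level conversion; these luminance weights are an assumption.
    gray = [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in rgb_block]

    texture = list(texture_fn(gray))
    if len(texture) != 6:
        raise ValueError("texture_fn must return six values")

    # Each texture value is repeated 16 times so that every pixel carries it.
    texture_rows = [[value] * 16 for value in texture]

    return color_rows + texture_rows  # 9 rows x 16 columns


def stack_subblocks(feature_arrays):
    """Concatenate 9 x 16 arrays column-wise into one 9 x (16 * k) array."""
    return [sum((fa[row] for fa in feature_arrays), []) for row in range(9)]
```

Since each sub-block contributes 16 columns, the 2,400,000 pixels of the training set correspond to 150,000 sub-blocks stacked this way.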


Table 2
Distribution of color-texture features in the training dataset.

| Block type | No. of blocks | Block size (pixels) | Total pixels | Source |
|---|---|---|---|---|
| Pure skin blocks | 450 | 40 × 40 | 720,000 | Humanae database |
| Pure non-skin blocks | 450 | 40 × 40 | 720,000 | Collected from the Internet |
| Mixed blocks | 600 | 40 × 40 | 960,000 | ECU database |
| Total size of database | | | 2,400,000 | |

ratio of True Positive components to elements inherently ranked as the positive class. They can be calculated as follows:

$$P = \frac{\text{true positive}}{\text{true positive} + \text{false positive}} \qquad (1)$$

$$R = \frac{\text{true positive}}{\text{true positive} + \text{false negative}} \qquad (2)$$

Fig. 5. The chromosome design consists of n, C1 ∼ C3 (RGB color space), and T1 ∼ T6 (texture descriptors).
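
Eqs. (1) and (2), together with the F1-measure (the harmonic mean of P and R), can be computed from a predicted binary mask and its ground truth as in the sketch below. The function name `precision_recall_f1` is illustrative, and returning 0.0 when a denominator is zero is a convention assumed here; the paper does not say how such cases were handled.

```python
def precision_recall_f1(predicted, truth):
    """Compute (P, R, F1) for two equal-length sequences of 0/1 labels.

    1 marks a skin pixel and 0 a non-skin pixel.
    """
    if len(predicted) != len(truth):
        raise ValueError("masks must have the same length")

    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)

    # Assumption: undefined ratios are reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        f1 = 0.0
    else:
        f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1
```

True negatives do not enter any of the three values, which is why these measures stay informative when non-skin pixels dominate an image.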

Because of the trade-off between P and R, the detection accuracy is measured using the F1-measure, which is the harmonic mean of P and R. The F1-measure is given by:

$$F_1\text{-measure} = \frac{2PR}{P + R} \qquad (3)$$

4.3. MLP ANN optimization

In order to exploit the DE algorithm for optimal selection, the chromosome and the fitness function should be designed appropriately. The chromosome's design and its evaluation to obtain our optimization goals are as follows:

(1) Optimization of objective function: There are two objectives in the optimization process: (a) to determine the best combination of color space, C1 ∼ C3, and texture information, T1 ∼ T6, that produces the highest detection accuracy and (b) to determine the optimal number of neurons n in the hidden layer of the MLP ANN.
(2) Chromosome design: We designed the chromosome so that all the parameters (i.e., n, C1 ∼ C3, and T1 ∼ T6) are simultaneously optimized. As shown in Fig. 5, the chromosome encapsulates the targeted parameters.

During the optimization process, the color and texture parameters, C1 ∼ C3 and T1 ∼ T6, are assigned binary values 0 and 1. The binary value 1 indicates that the parameter is included within the current combination of color-texture training data, and 0 indicates that it is excluded. The number of neurons in the hidden layer was investigated using 3–20 neurons. Based on Fig. 5, if the values of the current chromosome are 8 1 1 1 0 0 1 0 1 0, this indicates that the current MLP ANN has eight neurons in its hidden layer and five neurons in the input layer, and the training data are R, G, B, K, and U as shown in Fig. 6. For each training dataset, the training process is repeated for every chromosome in all generations as shown in Fig. 7. The maximum number of iterations used is 100, and the stop criterion is the minimum mean square error (MSE), which denotes the best performance of the MLP ANN.

After completion of the training process, the DE adjusts the combination to obtain a better quality measure (smaller MSE in our case) and performs another training process using the new input combination. The iterations continue until the DE algorithm finds the combination that yields the smallest MSE. By the end of the six optimization operations, the minimum MSE value was 4.46 when the YIQ color space was combined with standard deviation, kurtosis, uniformity, and entropy using 12 neurons. The results of the optimization step are presented in Table 3.
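
The mutation, recombination and selection loop that drives this optimization (Section 3.2) can be sketched generically as follows. This is an illustration under stated assumptions, not the authors' procedure: it uses the common DE/rand/1/bin variant on a continuous toy objective, whereas the paper optimizes a chromosome of one integer and nine binary genes and uses the MSE of a trained MLP as the objective. The function name and the values of `pop_size`, `f` and `cr` are illustrative choices that the paper does not report.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=100, seed=0):
    """Minimize `objective` over box `bounds` with DE/rand/1/bin (a sketch)."""
    rng = random.Random(seed)
    dim = len(bounds)

    # Initial population of candidate solutions within predefined boundaries.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]

    for _ in range(generations):
        for t in range(pop_size):
            # Mutation: combine three distinct members other than the target.
            a, b, c = rng.sample([k for k in range(pop_size) if k != t], 3)
            mutant = [pop[a][d] + f * (pop[b][d] - pop[c][d]) for d in range(dim)]
            # Clip the mutant back into the search box.
            mutant = [min(max(v, lo), hi) for v, (lo, hi) in zip(mutant, bounds)]

            # Recombination: mix mutant and target into the trial solution,
            # forcing at least one gene to come from the mutant.
            forced = rng.randrange(dim)
            trial = [mutant[d] if (rng.random() < cr or d == forced) else pop[t][d]
                     for d in range(dim)]

            # Selection: keep the trial only if it reduces the objective.
            trial_cost = objective(trial)
            if trial_cost < cost[t]:
                pop[t], cost[t] = trial, trial_cost

    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]
```

In the setting of this section, `objective` would train an MLP on the inputs selected by a chromosome and return its MSE, which makes each evaluation far more expensive than in this toy form.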

Fig. 6. ANN topology based on the chromosome generated by DE.
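
The mapping from a chromosome to the topology shown in Fig. 6 can be sketched as a small decoding step. The gene order T1 to T6 = (σ, S, K, r, U, e) is an assumption inferred from the row order of Table 3 and from the worked example in the text, in which the chromosome 8 1 1 1 0 0 1 0 1 0 selects R, G, B, K and U. The function and constant names are illustrative.

```python
# Assumed order of the texture genes T1..T6 (inferred, not stated in the paper).
TEXTURE_ORDER = ("sigma", "S", "K", "r", "U", "e")


def decode_chromosome(chromosome, channel_names=("R", "G", "B")):
    """Decode [n, C1, C2, C3, T1, ..., T6] into an MLP input configuration."""
    if len(chromosome) != 10:
        raise ValueError("expected 10 genes: n, C1..C3, T1..T6")

    n_hidden = chromosome[0]
    if not 3 <= n_hidden <= 20:
        raise ValueError("hidden layer size was investigated in the range 3-20")

    flags = chromosome[1:]
    if any(flag not in (0, 1) for flag in flags):
        raise ValueError("color and texture genes must be 0 or 1")

    names = list(channel_names) + list(TEXTURE_ORDER)
    selected = [name for name, flag in zip(names, flags) if flag == 1]

    return {
        "hidden_neurons": n_hidden,
        "inputs": selected,
        "input_neurons": len(selected),
    }
```

For the example chromosome this yields eight hidden neurons and the five inputs R, G, B, K and U, matching the description in the text; for another color space, `channel_names` would be replaced by that space's channels.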


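Table 3
Results of optimization step: best color-texture combinations.

| | RGB | YCbCr | YIQ | YDbDr | HSV | CIE L*a*b |
|---|---|---|---|---|---|---|
| Number of hidden neurons | 8 | 8 | 12 | 10 | 12 | 12 |
| MSE | 4.78 | 4.66 | 4.46 | 4.57 | 4.51 | 4.61 |

Texture descriptor selections per row (√ selected, – not selected; the marks are listed as extracted and their column positions are not preserved):
σ: √ √ √ √ – –
S: √ √ √ √ – –
K: √ √ – – – –
r: √ √ √ – – –
U: √ √ √ √ √ –
e: √ √ √ √ √ –

Highlight the lowest values.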

Fig. 7. Parameter optimization process.

4.4. Enhancing the output of the MLP ANN

From the output of the optimized MLP ANN, we can see that the image can be segmented into three areas: skin areas (white), non-skin areas (black or dark), and skin-like areas (between white and black) as shown in Fig. 8. This segmentation arises because the output of the MLP ANN is not binary but ranges from 0 to 1, where 0 denotes non-skin and 1 denotes skin. A common procedure is to use a threshold to convert the output of the MLP ANN to a binary image that presents the skin and non-skin regions of the image. Instead of using a threshold, we opted to use the k-means clustering method to convert the output of the MLP ANN to the desired form.

The output of the MLP ANN is one-dimensional data where each pixel is represented by a single value; however, it is better if that single value is paired with another value to describe or represent the pixel. With two-dimensional data vectors, segmenting the image into skin and non-skin by some clustering method, such as the k-means clustering method, can be much more accurate. To use the k-means clustering method, we investigated pairing the output of the MLP ANN with the information of one of the original image channels, such as Y, Q, Cb, and H. After testing different channels, we found that the Q channel of the YIQ color space gave the best results in terms of accuracy.

Clustering an image using the paired data along with the k-means clustering method into three clusters is shown in Fig. 9. The x-axis represents the output of the MLP ANN, and the y-axis represents the Q channel. For each cluster, the k-means method determines a centroid, which is the point with coordinates equal to the average values of the variables (output value and Q value) for the pixels in that cluster. Skin areas have a white color because the skin pixels have values of 1 or near 1. Hence, the cluster that represents skin is the one that has a centroid with the highest x-coordinate. Coordinates of the clusters’ centroids for the image whose results are in Fig. 9 are listed in Table 4.

As shown in Table 4, the cluster that represents the skin area has the highest x-coordinate (0.966). This fact can be used to identify the skin area after clustering pixels using the k-means clustering technique.

This is the advantage of combining the output image of the MLP ANN with the Q channel of the image. If the k-means clustering

Fig. 8. Three areas in the output of the MLP ANN: (1) skin, (2) non-skin, (3) skin-like.

Fig. 9. Clustering the data into three clusters.

Fig. 10. Proposed skin detection method.



Fig. 11. Results of skin detection: (a) original image, (b) ground truth, (c) proposed detected skin (using MLP ANN with k-means), (d) detected skin (using MLP ANN with a
dynamic threshold).

technique is used to segment the image into skin and non-skin using any two channels, such as Y–Q or Y–I, the cluster that corresponds to the skin area cannot be identified automatically. With the proposed algorithm, identifying the cluster corresponding to the skin area becomes direct and explicit.

Table 4
The three clusters and their centroids.

Cluster      x (output of MLP ANN)   y (Q channel)   Area type
1 (Blue)     0.019                   0.569           Non-skin
2 (Green)    0.023                   0.453           Non-skin
3 (Red)      0.966                   0.647           Skin

The proposed algorithm can be summarized as follows:

Phase A: Image dataset training and optimized feature extraction

(1) Different color spaces and texture descriptors are studied and investigated.
(2) The DE algorithm is used to select the best combination of color-texture information that can be used as input to the MLP ANN.

Phase B: Skin detection

(1) The image is converted into the YIQ color space and then divided into 4 × 4 blocks.
(2) For each block, the texture descriptors (standard deviation, kurtosis, uniformity, and entropy) are extracted and combined with YIQ color information to form the input.
(3) The input data are applied to the optimized MLP ANN that was obtained in Phase A, and an output image O is obtained. O takes values in the range 0–1.
(4) Then, O is combined with the Q channel that was obtained in step 1 to form the 2-dimensional data (O–Q).
(5) The k-means clustering technique is used to cluster the O–Q data into three clusters C, each with its own centroid; the centroid coordinates Cx and Cy correspond to O (x-axis) and Q (y-axis), respectively. The k-means technique labels the pixels in O according to the index of the cluster to which they belong as follows:

\[ O(C_i) = i, \quad i = 1, 2, 3 \tag{4} \]

(6) The index of the cluster corresponding to the skin area, s, can be identified according to the following equation:

\[ s = i \quad \text{if} \quad \max(C_x) = C_{x_i}, \quad i = 1, 2, 3 \tag{5} \]

(7) Then, the non-skin area in the labeled image O is omitted using the following equation:

\[ O(C_i) = \begin{cases} 1 & \text{if } i = s \\ 0 & \text{if } i \neq s \end{cases}, \quad i = 1, 2, 3 \tag{6} \]

The resultant image is a binary image where the skin area corresponds to 1 values and the non-skin area corresponds to 0 values.
(8) Morphological operations (opening and closing) are used to overcome holes and the isolated pixels that are outliers (skin pixels mistakenly detected as non-skin and vice versa).
(9) Finally, skin areas with sizes less than 5% of the largest skin area in the image are neglected. Those areas are mostly incorrectly detected as skin and cannot be omitted using the morphological operations.

The proposed algorithm is illustrated in Fig. 10.

5. Experimental results

The experiments were carried out using images from the ECU database. ECU images were used because the images are acquired under uncontrolled lighting conditions, and objects with a skin-like color often appear in the background, which makes skin detection a difficult process [46]. The ECU images are provided with ground truth skin binary masks that indicate the skin regions.

The proposed algorithm achieved an accuracy of F1 = 87.82% compared to the accuracy of F1 = 82.30% that can be achieved when the MLP ANN is used along with a dynamic threshold as discussed in our previous work [47]. Fig. 11 illustrates a comparison between the proposed algorithm (MLP ANN with k-means clustering) and the MLP ANN with a dynamic threshold.

The proposed algorithm was compared with three different algorithms based on explicit rules [9], ANN along with RGB and texture information [17], and Bayesian network with YCbCr color space [48]. The comparison is presented in Table 5.

Table 5
Results of skin detection using different methods.

Method                                                  Recall (R, %)   Precision (P, %)   F1 (%)
The proposed method (MLP ANN + k-means clustering)      88.00           87.65              87.82
MLP ANN + dynamic threshold [47]                        85.51           79.32              82.30
Explicit rules (YCbCr + YUV) [9]                        91.16           41.71              57.24
Explicit rules (RGB + YUV) [9]                          89.62           59.12              71.25
ANN (RGB + texture) [17]                                81.26           77.31              79.23
Bayesian network (YCbCr) [48]                           62.87           41.28              49.83
Bayesian network (YIQ + texture)                        85.79           50.82              63.83

The highest F1 value (87.82%) belongs to the proposed method.

6. Discussion

From Table 5, we observe that the proposed method has the highest accuracy, 87.82%, in terms of F1-measure. The main contributor to that accuracy is the optimized MLP ANN used. Even when only a simple dynamic threshold was used, the MLP ANN achieved an accuracy of 82.30%, which is still higher than the other methods we used for comparison. The experimental results confirmed the superior performance of ANN over the other methods. Although the method in Ref. [17] is based on ANN and color-texture descriptors, it achieved only 79.23% because it used a different color space and slightly different texture information (standard deviation, entropy, and range). This confirms the fact that color spaces have different abilities to represent human skin. In addition, color information alone is not enough for accurate skin detection.

The high performance of the ANN-based methods, in terms of F1-measure, is because of their ability to achieve a high and balanced recall and precision. In Table 5, the three ANN-based methods have recall and precision values of (88.00, 87.65), (85.51, 79.32), and (81.26, 77.31). On the contrary, the methods with explicit rules can achieve high recall values, 91.16 and 89.62, but low precision values, 41.71 and 59.12. Such unbalanced recall-precision values make the accuracy quite low. The same can be said about the Bayesian network-based methods.

The proposed method achieved very good performance; however, illumination conditions and objects or backgrounds with skin-like color are still a challenge. Poor illumination quality in dark images makes skin pixels look darker and results in their misclassification as non-skin pixels, which decreases the recall values. Conversely, background pixels and objects with skin-like color are misclassified as skin, which decreases the precision values.

7. Conclusion

In this paper, a novel human skin detection method is proposed. The proposed method is a hybrid method that combines the advantages of two techniques: neural networks and k-means clustering. In addition, different combinations of color-texture descriptors were investigated to determine the optimal descriptor among the possible combinations. A powerful optimization method, DE, was used in order to determine the optimal combination that can be used for accurate skin detection. The experimental results show that the proposed method can achieve a high accuracy, with an F1-measure of 87.82%, based on images from the ECU database.

For future work, we suggest employing a dynamic lighting correction algorithm in order to overcome poor illumination conditions. To overcome skin-like pixels, more texture descriptors may be investigated to find texture information that can represent human skin accurately.

Acknowledgment

The authors would like to acknowledge Universiti Sains Malaysia Research Grant Individual (USM-RUI) with No: 1001/PELECT/814092 for the financial support.
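Steps (4) to (7) of Phase B, together with Eqs. (4)–(6), can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `kmeans_2d` and `skin_mask`, the deterministic centroid initialization, and the nine toy pixels (chosen near the centroids in Table 4) are all introduced here for illustration, and cluster indices are 0-based rather than 1 to 3.

```python
def kmeans_2d(points, k=3, max_iter=100):
    """Plain Lloyd's k-means on 2-D points; returns (labels, centroids)."""
    # Deterministic start (an assumption made for reproducibility): take k
    # evenly spaced points from the list sorted by x, then y.
    ordered = sorted(points)
    centroids = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda j: (x - centroids[j][0]) ** 2 + (y - centroids[j][1]) ** 2)
            for x, y in points
        ]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append((sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members)))
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def skin_mask(o_values, q_values, k=3):
    """Binary skin mask from MLP outputs O and Q-channel values (flattened)."""
    points = list(zip(o_values, q_values))              # step (4): O-Q pairs
    labels, centroids = kmeans_2d(points, k)            # step (5), Eq. (4)
    s = max(range(k), key=lambda j: centroids[j][0])    # step (6), Eq. (5)
    return [1 if lab == s else 0 for lab in labels]     # step (7), Eq. (6)


# Toy data: six non-skin pixels (O near 0) followed by three skin pixels (O near 1).
o = [0.02, 0.01, 0.03, 0.02, 0.03, 0.02, 0.97, 0.96, 0.95]
q = [0.57, 0.56, 0.58, 0.45, 0.46, 0.44, 0.65, 0.64, 0.66]
mask = skin_mask(o, q)
print(mask)
```

Because the skin cluster is chosen by the largest centroid x-coordinate, no threshold on O has to be tuned; the Q values only influence how the pixels are grouped. Steps (8) and (9), the morphological clean-up and the removal of small regions, are not covered by this sketch.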

References

[1] C.C. Liu, P.C. Chung, Objects extraction algorithm of color image using adaptive forecasting filters created automatically, Int. J. Innov. Comput. Inf. Control 7 (10) (2011) 5771–5787.
[2] P. Kakumanu, S. Makrogiannis, N. Bourbakis, A survey of skin-color modeling and detection methods, Pattern Recognit. 40 (3 (March)) (2007) 1106–1122.
[3] C. Zhipeng, Face detection system based on skin color model, in: 2010 Int. Conf. Netw. Digit. Soc., May, 2010, pp. 664–667.
[4] Z. Zhang, H. Gunes, M. Piccardi, Head detection for video surveillance based on categorical hair and skin colour models, in: 2009 16th IEEE International Conference on Image Processing (ICIP), 2009, pp. 1137–1140.
[5] J.S. Lee, Y.M. Kuo, P.C. Chung, Detecting nakedness in color images, in: H.T. Sencar, S. Velastin, N. Nikolaidis, S. Lian (Eds.), Studies in Computational Intelligence, vol. 282, Springer, Berlin, Heidelberg, 2010, pp. 225–236.
[6] J. Han, G. Awad, A. Sutherland, Automatic skin segmentation and tracking in sign language recognition, IET Comput. Vis. 3 (1) (2009) 24.
[7] H.K. Al-Mohair, J. Mohamad-Saleh, S.A. Suandi, Human skin color detection: a review on neural network perspective, Int. J. Innov. Comput. Inf. Control 8 (12) (2012) 8115–8131.
[8] M. Abdullah-Al-Wadud, M. Shoyaib, O. Chae, A skin detection approach based on color distance map, EURASIP J. Adv. Signal Process. 2008 (1) (2008) 814283.
[9] F. Xiang, Fusion of multi color space for human skin region segmentation, Int. J. Inf. Electron. Eng. 3 (2) (2013) 172–174.
[10] H.K. Almohair, A.R. Ramli, A.M. Elsadig, S.J. Hashim, Skin detection in luminance images using threshold technique, Int. J. Comput. Internet Manag. 15 (1) (2007) 25–32.
[11] Son Lam Phung, Abdesselam Bouzerdoum, Douglas Chai, A novel skin color model in YCBCR color space and its application to human face detection, IEEE International Conference on Image Processing, vol. 1, 2002.
[12] M.F. Hossain, M. Shamsi, M.R. Alsharif, R.A. Zoroofi, K. Yamashita, Automatic facial skin detection using Gaussian Mixture Model under varying illumination, Int. J. Innov. Comput. Inf. Control 8 (2) (2012) 1135–1144.
[13] B. Kwolek, Face tracking system based on color, stereovision and elliptical shape features, in: 2003 Proceedings IEEE Conference on Advanced Video and Signal Based Surveillance, 2003, pp. 21–26.
[14] V.S.A. Vladimir Vezhnevets, Andreeva, A survey on pixel-based skin color detection techniques, Proc. Graph., vol. 15, 2003, pp. 85–92.
[15] S. Khan, D. Bailey, G. Sen Gupta, S. Demidenko, Adaptive classifier for robust detection of signing articulators based on skin colour, in: 2011 Sixth IEEE International Symposium on Electronic Design, Test and Application, 2011, pp. 259–262.
[16] C.A. Doukim, J.A. Dargham, A. Chekima, S. Omatu, Combining neural networks for skin detection, Signal Image Process. Int. J. 1 (2) (2010) 1–11.
[17] A.Y. Taqa, H.A. Jalab, Increasing the reliability of skin detectors, Sci. Res. Essays 5 (17) (2010) 2480–2490.
[18] A. Abadpour, S. Kasaei, Comprehensive Evaluation of the Pixel-Based Skin Detection Approach for Pornography Filtering in the Internet Resources, 1996, pp. 1–6.
[19] B.D. Zarit, B.J. Super, F.K.H. Quek, Comparison of five color models in skin pixel classification, in: Proc. Int. Work. Recognition, Anal. Track. Faces Gestures Real-Time Syst. Conjunction with ICCV’99 (Cat. No. PR00378), 58–63.
[20] J.-C. Terrillon, M.N. Shirazi, H. Fukamachi, S. Akamatsu, Comparative performance of different skin chrominance models and chrominance spaces for the automatic detection of human faces in color images, in: Fourth IEEE International Conference on Automatic Face and Gesture Recognition, 2000, pp. 54–61.
[21] G. Gomez, M. Morelos, On selecting colour components for skin detection, in: 16th International Conference on Pattern Recognition, 2002, pp. 961–964.
[22] D. Kuiaski, H.V. Neto, G. Borba, H. Gamba, A study of the effect of illumination conditions and color spaces on skin segmentation, in: XXII Brazilian Symposium on Computer Graphics and Image Processing, 2009, pp. 245–252.
[23] L. Chen, J. Zhou, Z. Liu, W. Chen, G. Xiong, A skin detector based on neural network, IEEE 2002 International Conference on Communications, Circuits and Systems and West Sino Expositions, vol. 1, 2002, pp. 615–619.
[24] D. Valaparla, V.K. Asari, Neural network based skin color model for face detection, in: Proceedings of 32nd Appl. Imag. Pattern Recognit. Work, 2003, 2003, pp. 141–145.
[25] G. Yang, H. Li, L. Zhang, Y. Cao, Research on a skin color detection algorithm based on self-adaptive skin color model, in: 2010 International Conference on Communications and Intelligence Information Security, 2010, pp. 266–270.
[26] A.A. Zaidan, N.N. Ahmad, H.A. Karim, G.M. Alam, B.B. Zaidan, Increase reliability for skin detector using backprobgation neural network and heuristic rules based on YCbCr, Sci. Res. Essays 5 (19) (2010) 2931–2946.
[27] P. Kakumanu, S. Makrogiannis, R. Bryll, S. Panchanathan, N. Bourbakis, Image chromatic adaptation using ANNs for skin color adaptation, in: 16th IEEE International Conference on Tools with Artificial Intelligence, ICTAI, 2004, pp. 478–485.
[28] L. Duan, Z. Lin, J. Miao, Y. Qiao, A method of human skin region detection based on PCNN, Lect. Notes Comput. Sci. 5553 (2009) 486–493.
[29] K.K. Bhoyar, O.G. Kakde, Skin color detection model using neural networks and its performance evaluation, J. Comput. Sci. 6 (9) (2010) 955–960.
[30] P.K. Sree, I.R. Babu, Face detection from still and video images using unsupervised cellular automata with K means clustering algorithm, ICGST J. Graph. Vis. Image Process. 8 (2) (2008) 1–7.
[31] V. Bevilacqua, G. Filograno, G. Mastronardi, Face detection by means of skin detection, in: Fourth International Conference on Intelligent Computing, ICIC, 2008, pp. 1210–1220.
[32] P. Rocca, G. Oliveri, A. Massa, Differential evolution as applied to electromagnetics, IEEE Antennas Propag. Mag. 53 (February) (2011) 38–49.
[33] S.-T. Pan, B.-Y. Tsai, C.-S. Yang, Differential evolution algorithm on robust IIR filter design and implementation, in: Eighth International Conference on Intelligent Systems Design and Applications, 2008, pp. 537–542.
[34] I.T. Rekanos, Shape reconstruction of a perfectly conducting scatterer using differential evolution and particle swarm optimization, IEEE Trans. Geosci. Remote Sens. 46 (7 (July)) (2008) 1967–1974.
[35] V. Aslantas, An optimal robust digital image watermarking based on SVD using differential evolution algorithm, Opt. Commun. 282 (5 (March)) (2009) 769–777.
[36] K. Wagstaff, C. Cardie, Constrained K-means clustering with background knowledge, in: The Eighteenth International Conference on Machine Learning, 2001, pp. 577–584.
[37] A. Halder, Color image segmentation using rough set based K-means algorithm, Int. J. Comput. Appl. 57 (12) (2012) 32–38.
[38] D. Sonagara, S. Badheka, Comparison of basic clustering algorithms, Int. J. Comput. Sci. Mob. Comput. 3 (10) (2014) 58–61.
[39] H.P. Ng, S.H. Ong, K.W.C. Foong, P.S. Goh, W.L. Nowinski, Medical image segmentation using k-means clustering and improved watershed algorithm, Proceedings of the IEEE Southwest Symposium on Image Analysis and Interpretation, vol. 2006, 2006, pp. 61–65.
[40] M.-N. Wu, C.-C. Lin, C.-C. Chang, Brain tumor detection using color-based K-means clustering segmentation, Third International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2007) 2 (2007) 245–250.
[41] Z.S. Younus, D. Mohamad, T. Saba, M.H. Alkawaz, A. Rehman, M. Al-Rodhaan, A. Al-Dhelaan, Content-based image retrieval using PSO and k-means clustering algorithm, Arab. J. Geosci. (August) (2014) 1–14.
[42] R.C. Gonzalez, R.E. Woods, S.L. Eddins, Digital Image Processing Using Matlab, 2nd ed., Gatesmark Publishing, USA, 2009.
[43] Humanæ (2014) (Online). Available: http://humanae.tumblr.com/
[44] N.V. Chawla, Data mining for imbalanced datasets: an overview, in: O.Z. Maimon, L. Rokach (Eds.), Data Mining and Knowledge Discovery Handbook, Springer, US, 2005, pp. 853–867.
[45] C.D. Manning, P. Raghavan, H. Schutze, Introduction to Information Retrieval, Cambridge University Press, Cambridge, 2009.
[46] S.L. Phung, A. Bouzerdoum, D. Chai, Skin segmentation using color pixel classification: analysis and comparison, IEEE Trans. Pattern Anal. Mach. Intell. 27 (1 (January)) (2005) 148–154.
[47] H.K. Al-Mohair, J. Mohamad-Saleh, S.A. Suandi, Color space selection for human skin detection using color-texture features and neural networks, in: 2014 Int. Conf. Comput. Inf. Sci., June, 2014, pp. 1–6.
[48] A. Rasim, T. Alexander, Hand detection based on skin color segmentation and classification of image local features, TEM J. 2 (3) (2013) 150–155.
