COLOUR SPACES STUDY FOR SKIN COLOUR DETECTION IN
FACE RECOGNITION SYSTEMS
Jose M. Chaves-González, Miguel A. Vega-Rodríguez
Juan A. Gómez-Pulido and Juan M. Sánchez-Pérez
Univ. Extremadura, Dept. Technologies of Computers and Communications,
Escuela Politécnica, Campus Universitario s/n, 10071, Cáceres, Spain
Keywords: Face detection, colour spaces, YUV, YIQ, RGB, YCbCr, HSV, skin colour detection.
Abstract: In this paper we present the results of a comparison among different colour spaces, carried out to determine which one is better suited to human skin colour detection in face detection systems. Our motivation for this study is that there is no common opinion about which colour space is the best choice for finding skin colour in an image. This matters because most face detectors use skin colour to locate the face in a picture or a video. We have studied 10 different colour spaces (RGB, CMY, YUV, YIQ, YCbCr, YPbPr, YCgCr, YDbDr, HSV –or HSI– and CIE-XYZ). To make the comparisons we have used truth images of 15 different people, counting at pixel level the correct detections, false negatives and false positives for each colour space.
1 INTRODUCTION
The automatic processing of images containing faces is nowadays essential in many fields. One of the most successful and important applications is face recognition (Zhao, 2003). The first stage of every face recognition system is the detection of the face within the image that contains it. There are several methods for finding a face in an image (Yang, 2002). In this study we focus on a method classified among the feature invariant approaches: it detects faces in pictures by finding human skin colour.
In the literature many colour spaces have been used to label pixels as skin. For example, there are studies using RGB (Naseem, 2005), HSV (Sigal, 2004), YCbCr (Jinfeng, 2004), YPbPr (Campadelli, 2005), YUV (Runsheng, 2006), YIQ (Jinfeng, 2004), etc., and even combinations of several colour spaces applied together to the same problem (Jinfeng, 2004).
The motivation for our study is that there is no common criterion about which colour space is the best for finding skin colour in an image. It is true that other studies compare the behaviour of different colour spaces for finding skin (Albiol, 2001), (Phung, 2005), but the approach of our study is completely different. We carry out a detailed study of how well different colour spaces perform under the same conditions (the same images and the same method to find skin colour) and with high precision in the comparisons, because we work at pixel level, using truth images. After the study we can therefore state which colour space is better, and by what margin, by comparing how many skin pixels could be identified using each colour format.
To explain our study, this paper is organized as follows: section 2 provides a detailed description of the experiments performed. The results obtained are presented and analysed in section 3. Finally, conclusions are drawn in section 4.
2 EXPERIMENTS
We have studied in depth 10 different colour spaces (RGB, CMY, YUV, YIQ, YPbPr, YCbCr, YCgCr, YDbDr, HSV and CIE-XYZ) using 15 images of different people taken from the AR face database (Martinez, 1998).
To determine which colour space best detects human skin colour, we have generated truth images for the 15 pictures used in the experiments. In these images, the parts of the photo which do not contain skin colour are removed. So, in the truth
images there is no hair, no beard, no lips, no eyes, no background, etc. Figure 1 shows two of the real images used in the study and the truth images for those pictures. For the classification, we have developed a K-Means classifier (Shapiro, 2001) with some improvements.
Figure 1: Example of two of the images (and the truth
images for each) used in the study.
The K-Means method is a clustering algorithm which groups the data to be classified into K classes. In our case, these data are the pixels of the image. The value of K must be set before running the algorithm. We ran some tests with different K values (K=2, K=3, K=4, K=7), as can be observed in figure 2, but in the end we decided to use K=3 in our study because we wanted a balance between performance and result quality. K-Means is a very efficient algorithm, but when K is increased the convergence time increases too.
Figure 2: Results obtained with K-Means algorithm used
over RGB colour space when K = 2; K = 3; K = 4 and K =
7 (from left to right).
Moreover, we think that 3 classes are a very sensible choice for the type of images we handle in our study (typical face recognition images with a constant background). In this case one class is associated with skin colour, another class groups the darkest parts of the image (mainly hair, beard, eyebrows…) and a third class is associated with the brightest parts of the image (such as the background and possibly some highlights on parts of the skin).
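The improvements to our classifier are not detailed here, but the core clustering step can be illustrated with a minimal, plain K-Means sketch in Python (NumPy); the names are illustrative and this is not the exact implementation used in the study:

    import numpy as np

    def kmeans_pixels(pixels, k=3, iters=50, seed=0):
        """Cluster image pixels (an N x channels array) into k classes with plain K-Means."""
        rng = np.random.default_rng(seed)
        pixels = pixels.astype(float)
        # Initialise the centroids with k randomly chosen pixels.
        centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]
        for _ in range(iters):
            # Assign every pixel to its nearest centroid (Euclidean distance).
            dists = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Recompute each centroid as the mean of the pixels assigned to it.
            new_centroids = np.array([pixels[labels == c].mean(axis=0) if np.any(labels == c)
                                      else centroids[c] for c in range(k)])
            if np.allclose(new_centroids, centroids):
                break
            centroids = new_centroids
        return labels

    # Example: cluster the pixels of an H x W x 3 image using only its first channel.
    # labels = kmeans_pixels(img.reshape(-1, 3)[:, :1], k=3)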
For each colour space we have run tests with each channel alone, with two channels together (in all possible combinations) and with the three channels of the colour space, to discover which combination gives the best results. As stated in the introduction, we carry out the study at pixel level. Once the K-Means method provides the classification of the pixels in a given colour space, we compare this classification with the truth image. The comparison is done for each pixel of the obtained result against the same pixel in the truth image. If the result obtained by the classifier coincides with what the truth image says for that pixel, we have a right detection; if our classifier says that there is skin in a pixel where there is none, we have a false positive; and finally, if our classifier says that there is no skin in a pixel where there really is, we have a false negative.
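A minimal sketch of this pixel-level bookkeeping, assuming the classifier output and the truth image are available as boolean skin masks of the same size (names are illustrative):

    import numpy as np

    def evaluate_mask(skin_pred, skin_truth):
        """Compare a predicted skin mask with the truth mask pixel by pixel.

        Both arguments are boolean arrays of the same shape, True where the
        pixel is labelled as skin. Returns right-detection, false-positive
        and false-negative rates over all pixels.
        """
        hits = np.count_nonzero(skin_pred == skin_truth)       # classifier agrees with the truth image
        false_pos = np.count_nonzero(skin_pred & ~skin_truth)  # skin reported where there is none
        false_neg = np.count_nonzero(~skin_pred & skin_truth)  # skin missed
        total = skin_pred.size
        return hits / total, false_pos / total, false_neg / total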
Table 1 shows the right detection results for each colour space. As can be seen, we have focused on the three channels of each colour space separately and on the three channels together. When we use the three channels together we consider what most channels say (e.g. if two of the three channels say that there is a hit at a given pixel, the final result is a hit).
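A sketch of this two-out-of-three vote, assuming one boolean hit mask per channel (True where that channel's label coincides with the truth image for the pixel):

    import numpy as np

    def majority_hits(hit_c1, hit_c2, hit_c3):
        """Two-out-of-three vote over the per-channel hit masks."""
        votes = hit_c1.astype(int) + hit_c2.astype(int) + hit_c3.astype(int)
        return votes >= 2  # the pixel counts as a right detection if most channels got it right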
3 RESULTS ANALYSIS
In this section we analyse and explain the results obtained with each colour space for skin detection. For theoretical background on the different colour models used, see the bibliographical references (Shapiro, 2001) and (Pratt, 2001). Due to the page limit of this paper, we are forced to summarise the results obtained in our study in table 1 (right detection rates).
3.1 RGB Model
This colour model is not very robust to changes in the illumination of the images. This explains why the channel with the most hits was the G channel, when obviously the most important channel for finding skin colour is R. The worst results in this model were, as expected, obtained by the B channel. We can conclude that this colour space is not the most appropriate one for finding skin colour in an image, although it can be used with success if the environment (illumination) is constant.
3.2 CMY Model
This colour space obtains the worst results in the study. This is quite reasonable because it is intended for other fields (printing rather than processing). Taking into account that this colour space is quite similar to RGB, it is natural that the best channel was M (because in RGB it was G), for the same reasons.
Table 1: Right detections for each colour space using the three channels of the space separately and the three together.
            RGB      CMY      YUV      YIQ      YCbCr    YPbPr    YCgCr    YDbDr    HSV      XYZ
C1          82.79%   82.86%   84.69%   84.69%   87.06%   84.69%   88.95%   88.95%   72.16%   87.44%
C2          87.23%   86.6%    87.86%   89.93%   87.86%   87.85%   87.48%   90.58%   94.04%   85.12%
C3          79.75%   82.33%   90.19%   70.32%   90.19%   90.19%   93.19%   91.42%   82.08%   74.1%
C1C2C3      86.55%   86.15%   89.7%    86.93%   89.8%    89.7%    92.63%   92.29%   95.06%   86.27%
3.3 YUV Model
The third channel of this colour space gives quite good results (90.19% of right detections). The V channel stores the difference between the red component of the colour and the luminance. In this colour space, as in most of the following ones, luminance information is separated from chrominance information, so the results obtained are better than for the RGB model because they are more robust to brightness variations.
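For illustration, the usual analogue RGB-to-YUV conversion can be sketched as follows (standard coefficients; the exact conversion used in our tool is not listed here):

    import numpy as np

    def rgb_to_yuv(img):
        """Convert an H x W x 3 RGB image with values in [0, 1] to YUV.

        Y is the luminance; U and V are the blue and red colour differences,
        so V is the channel that keeps most of the skin (red) information.
        """
        r, g, b = img[..., 0], img[..., 1], img[..., 2]
        y = 0.299 * r + 0.587 * g + 0.114 * b
        u = 0.492 * (b - y)
        v = 0.877 * (r - y)
        return np.stack([y, u, v], axis=-1)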
3.4 YIQ Model
Our conclusion for this colour space is that, although it obtains acceptable results, the YUV model, which is very similar, obtains better results. Therefore, according to our study, it is better to use the YUV colour space than YIQ for skin detection.
3.5 YCbCr Model
This colour model is very similar to YUV. We can state that this colour space obtains quite good results in general for skin colour detection, especially the Cr channel (90.19% of right detections), which stores the information of the red component of the colour model.
3.6 YPbPr Model
This colour space is the analogue version of the YCbCr model. Consequently, this colour model also gets its best result for the Pr channel. We can conclude that using the YPbPr model is equivalent to using the YCbCr model, but the latter is more commonly used for skin detection than YPbPr.
3.7 YCgCr Model
This colour space is a variation of the typical YCbCr model, since it uses the Cg channel instead of the Cb channel (de Dios, 2004). In this colour space, using the three channels together, there is a significant gain (from 89.8% for YCbCr to 92.63% for YCgCr), because the green component is considerably better than the blue component for detecting skin colour. In fact, the global right detection rate of this colour space is one of the highest in the study (only beaten by the HSV colour space).
3.8 YDbDr Model
This colour model is quite similar to the previous ones, but it gives better results for the blue channel than the other colour models (90.58% of right detections; the other colour spaces do not exceed 87.8% in any case) without losing precision in the other channels (especially in the Dr channel, which is very important because red is the channel that provides the best results in skin detection). For this reason, the YDbDr colour format provides quite good results both when studying the channels separately and when studying the three channels together (92.29% of hits).
3.9 HSV Model
This colour model provides the best results in our study. The right detection rate of this colour format using the three channels together is 95.06%. The best single channel is S, which refers to the saturation of the image and provides an average right detection rate of 94.04%. However, the H component has a quite low success rate (only 72.16%), for the same reason that RGB had a quite low success rate for the R channel: some skin parts of some faces in the database are confused with the background when the face has some brightness. Figure 3 shows an example of a picture from the facial database used in our study and the output provided by the K-Means classifier for each of the components of the HSV colour space. Skin colour detections by the classifier are coloured in pink.
Figure 3: From left to right: original face, K-means
classifier output for H channel, K-means classifier output
for S channel and K-means classifier output for V channel.
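As an illustration of why the S channel is attractive, the standard HSV saturation definition can be computed directly from RGB (a sketch, not the code used in the study):

    import numpy as np

    def saturation_channel(img):
        """Extract the HSV saturation channel from an H x W x 3 RGB image in [0, 1].

        S = (max - min) / max over the R, G and B values of a pixel,
        with S = 0 for black pixels.
        """
        mx = img.max(axis=-1)
        mn = img.min(axis=-1)
        return np.where(mx > 0, (mx - mn) / np.where(mx > 0, mx, 1.0), 0.0)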
3.10 CIE-XYZ Model
The results obtained with this colour space are quite similar to those obtained using RGB. However, we can notice that XYZ provides better results for the red-related channel (X in this case, 87.44%) than the RGB format (82.79%). So, we can say that the XYZ colour space is a bit more robust to illumination conditions than RGB.
4 CONCLUSIONS
We have carried out a complete study at pixel level of 10 different colour spaces using typical images from face recognition systems. The purpose of this study was to perform an objective comparison among the colour spaces most used in skin detection, to discover which colour model provides the best results. We can group the different colour spaces into 4 families: the RGB family (RGB and CMY), the YUV family (YUV, YIQ, YCbCr, YPbPr, YCgCr, YDbDr), the HSV family and the CIE family. According to the results obtained, the most appropriate family for skin detection is HSV (because the HSV colour format is the winner in our study). However, there is one component which, in general, provides consistent and positive results in all colour models: the red component (the most significant channel for skin detection).
We can also state that the luminance channel (in colour spaces where it is separated from chrominance: Y in almost all of them and V in HSV) is not a very important channel for skin colour detection. In fact, we can also say that colour spaces where luminance and chrominance are separated get better results (RGB, CMY and XYZ, which do not separate them, have the lowest right detection rates of all the models).
All in all, the 10 colour spaces that we have studied provide quite good results in skin colour detection. In general, all colour models have low false positive and false negative rates (peak values are explained by some unlucky highlights on the faces of some people), and the right detection rates using the three channels together are above 86% for all colour spaces, so we can conclude that all the models can be used for skin colour detection with more or less success and precision (this explains why in the literature there are studies using such a variety of colour spaces).
To sum up, after the quantitative study described in this paper, we can conclude that the HSV colour space is the model which gets the best results for skin colour detection. On the other hand, there are colour spaces that obtain rather poor results, such as CMY, CIE-XYZ, YIQ or even RGB. In any case, it is possible to use almost any colour space to find skin colour, because with an appropriate classifier and some pre-processing of the images (such as increasing the contrast) most colour spaces reach quite high right detection rates.
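The pre-processing step is not specified in this paper; as an example of what such a step could look like, a simple percentile-based linear contrast stretch is sketched below (illustrative only):

    import numpy as np

    def stretch_contrast(img, low_pct=2, high_pct=98):
        """Linearly stretch pixel values so the given percentiles map to 0 and 1."""
        lo, hi = np.percentile(img, [low_pct, high_pct])
        return np.clip((img - lo) / max(hi - lo, 1e-8), 0.0, 1.0)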
ACKNOWLEDGEMENTS
This work has been developed in part thanks to the
OPLINK project (TIN2005-08818-C04-03). José M.
Chaves-González is supported by research grant
PRE06003 from Junta de Extremadura (Spain).
REFERENCES
Albiol, A., Torres, L., et al. Optimum Color Spaces for
Skin Detection. IEEE International Conference on
Image Processing. vol 1, pp: 122-124, Oct. 2001.
Campadelli, P., Lanzarotti, R., et al. Face and facial
feature localization. Int. Conf. on Image Analysis and
Processing. vol 1, pp: 1002-1009, Oct. 2005.
de Dios, J.J., Garcia, N. Fast face segmentation in
component color space. International Conference on
Image Processing. vol. 1, pp: 191-194. Oct. 2004.
Jinfeng Y., Zhouyu F., et al. Adaptive skin detection using
multiple cues. International Conference on Image
Processing. vol. 2, pp: 901-904. Oct. 2004.
Martinez, A.M., Benavente, R. The AR Face Database.
CVC Technical Report #24, June 1998.
Naseem, I., Deriche, M. Robust human face detection in
complex color images. IEEE International Conference
on Image Processing. vol. 2, pp: 338-341, Sept. 2005.
Phung, S.L., Bouzerdoum, A., et al. Skin segmentation
using color pixel classification: analysis and
comparison. IEEE Transactions on PAMI. vol. 27, No.
1, pp.: 148-154, Jan. 2005.
Pratt, W.K. Digital Image Processing: PIKS Inside. 3rd
edition. John Wiley & Sons, 2001.
Runsheng J., Bin K., et al. Color Edge Detection Based on
YUV Space and Minimal Spanning Tree. IEEE
International Conference on Information Acquisition.
vol. 1, pp: 941 – 945, Aug. 2006.
Shapiro, L.G., Stockman, G.C. Computer Vision. Prentice-
Hall, 2001.
Sigal, L., Sclaroff, S., et al. Skin color-based video
segmentation under time-varying illumination. IEEE
Transactions on Pattern Analysis and Machine
Intelligence. vol. 26, No. 7, pp: 862 – 877, July 2004.
Yang, M-H., Kriegman, D.J., et al. Detecting Faces in
Images: A Survey. IEEE Transactions on PAMI, Vol.
24, No. 1, pp: 34-58, Jan. 2002.
Zhao, W., Chellappa, R., et al. Face Recognition: A
literature survey. ACM Computing Surveys, Vol. 35,
No. 4, pp: 399-458, Dec. 2003.