COLOUR SPACES STUDY FOR SKIN COLOUR DETECTION IN
FACE RECOGNITION SYSTEMS
Jose M. Chaves-González, Miguel A. Vega-Rodríguez
Juan A. Gómez-Pulido and Juan M. Sánchez-Pérez
Univ. Extremadura, Dept. Technologies of Computers and Communications,
Escuela Politécnica, Campus Universitario s/n, 10071, Cáceres, Spain
Keywords: Face detection, colour spaces, YUV, YIQ, RGB, YCbCr, HSV, skin colour detection.
Abstract: In this paper we present the results of a comparison among different colour spaces, carried out to determine which one is better suited to human skin colour detection in face detection systems. Our motivation for this study is that there is no common opinion about which colour space is the best choice for finding skin colour in an image. This matters because most face detectors use skin colour to locate the face in a picture or a video. We have studied 10 different colour spaces (RGB, CMY, YUV, YIQ, YCbCr, YPbPr, YCgCr, YDbDr, HSV –or HSI– and CIE-XYZ). To make the comparisons we have used truth images of 15 different people, counting at pixel level the correct detections, false negatives and false positives for each colour space.
1 INTRODUCTION
The automatic processing of images containing faces is nowadays essential in many fields. One of the most successful and important applications is face recognition (Zhao, 2003). The first stage of every face recognition system is the detection of the face within the image that contains it. There are several methods for finding a face in an image (Yang, 2002). In this study we focus on a method classified among the feature invariant approaches: it detects faces in pictures by finding human skin colour.
In the literature many colour spaces have been used to label pixels as skin. For example, there are studies using RGB (Naseem, 2005), HSV (Sigal, 2004), YCbCr (Jinfeng, 2004), YPbPr (Campadelli, 2005), YUV (Runsheng, 2006), YIQ (Jinfeng, 2004), etc., and even combinations of several colour spaces applied together to the same problem (Jinfeng, 2004).
The motivation for our study is that there is no common criterion about which colour space is the best for finding skin colour in an image. It is true that other studies compare the behaviour of different colour spaces for finding skin (Albiol, 2001), (Phung, 2005), but the approach of our study is completely different. We carry out a detailed study of how well different colour spaces perform under the same conditions (the same images and the same method to find skin colour) and with high precision in the comparisons, because we work at pixel level, using truth images. After the study we can therefore state which colour space is better, and by what margin, by comparing how many skin pixels could be identified using each colour format.
To explain our study, this paper is organized as follows: section 2 provides a detailed description of the experiments performed. The results obtained are presented and analysed in section 3. Finally, conclusions are drawn in section 4.
2 EXPERIMENTS
We have studied in depth 10 different colour spaces (RGB, CMY, YUV, YIQ, YPbPr, YCbCr, YCgCr, YDbDr, HSV and CIE-XYZ) using 15 images of different people taken from the AR face database (Martinez, 1998).
To determine which colour space best detects human skin colour, we have generated truth images for the 15 pictures used in the experiments. In these images, the parts of the photo which do not contain skin colour are removed. So, in the truth
images there is no hair, no beard, no lips, no eyes, no background, etc. Figure 1 shows two of the real images used in the study and the truth images for those pictures. For the classification, we have developed a K-Means classifier (Shapiro, 2001) with some improvements.
Figure 1: Example of two of the images (and the truth
images for each) used in the study.
The K-Means method is a clustering algorithm which groups the data to be classified into K classes. In our case, these data are the pixels of the image. The value of K must be set before running the algorithm. We ran some tests with different K values (K=2, K=3, K=4, K=7), as can be observed in figure 2, but in the end we decided to use K=3 in our study because we wanted a balance between performance and result quality. K-Means is a very efficient algorithm, but when K is increased the convergence time increases too.
Figure 2: Results obtained with K-Means algorithm used
over RGB colour space when K = 2; K = 3; K = 4 and K =
7 (from left to right).
Moreover, we think that 3 classes are a very sensible choice for the type of images we handle in our study (typical face recognition images with a constant background). In this case one class is associated with skin colour, another class groups the darkest parts of the image (mainly hair, beard, eyebrows…) and a third class is associated with the brightest parts of the image (such as the background and possibly some highlights on parts of the skin).
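The improvements to our classifier are not detailed here, but the core clustering step can be illustrated with a minimal, plain K-Means sketch in Python (NumPy); the names are illustrative and this is not the exact implementation used in the study:

    import numpy as np

    def kmeans_pixels(pixels, k=3, iters=50, seed=0):
        """Cluster image pixels (an N x channels array) into k classes with plain K-Means."""
        rng = np.random.default_rng(seed)
        pixels = pixels.astype(float)
        # Initialise the centroids with k randomly chosen pixels.
        centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]
        for _ in range(iters):
            # Assign every pixel to its nearest centroid (Euclidean distance).
            dists = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Recompute each centroid as the mean of the pixels assigned to it.
            new_centroids = np.array([pixels[labels == c].mean(axis=0) if np.any(labels == c)
                                      else centroids[c] for c in range(k)])
            if np.allclose(new_centroids, centroids):
                break
            centroids = new_centroids
        return labels

    # Example: cluster the pixels of an H x W x 3 image using only its first channel.
    # labels = kmeans_pixels(img.reshape(-1, 3)[:, :1], k=3)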
For each colour space we have run tests with each channel alone, with two channels together (in all possible combinations) and with the three channels of the colour space, to discover which combination gives the best results. As stated in the introduction, we carry out the study at pixel level. Once the K-Means method provides the classification of the pixels in a given colour space, we compare this classification with the truth image. The comparison is done for each pixel of the obtained result against the same pixel in the truth image. If the result obtained by the classifier coincides with what the truth image says for that pixel, we have a right detection; if our classifier says that there is skin in a pixel where there is none, we have a false positive; and finally, if our classifier says that there is no skin in a pixel where there really is, we have a false negative.
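A minimal sketch of this pixel-level bookkeeping, assuming the classifier output and the truth image are available as boolean skin masks of the same size (names are illustrative):

    import numpy as np

    def evaluate_mask(skin_pred, skin_truth):
        """Compare a predicted skin mask with the truth mask pixel by pixel.

        Both arguments are boolean arrays of the same shape, True where the
        pixel is labelled as skin. Returns right-detection, false-positive
        and false-negative rates over all pixels.
        """
        hits = np.count_nonzero(skin_pred == skin_truth)       # classifier agrees with the truth image
        false_pos = np.count_nonzero(skin_pred & ~skin_truth)  # skin reported where there is none
        false_neg = np.count_nonzero(~skin_pred & skin_truth)  # skin missed
        total = skin_pred.size
        return hits / total, false_pos / total, false_neg / total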
Table 1 shows the right detection results for each colour space. As can be seen, we have focused on the three channels of each colour space separately and on the three channels together. When we use the three channels together we consider what most channels say (e.g. if two of the three channels say that there is a hit at a given pixel, the final result is a hit).
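A sketch of this two-out-of-three vote, assuming one boolean hit mask per channel (True where that channel's label coincides with the truth image for the pixel):

    import numpy as np

    def majority_hits(hit_c1, hit_c2, hit_c3):
        """Two-out-of-three vote over the per-channel hit masks."""
        votes = hit_c1.astype(int) + hit_c2.astype(int) + hit_c3.astype(int)
        return votes >= 2  # the pixel counts as a right detection if most channels got it right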
3 RESULTS ANALYSIS
In this section we analyse and explain the results obtained with each colour space for skin detection. For theoretical background on the different colour models used, see the bibliographical references (Shapiro, 2001) and (Pratt, 2001). Due to the page limit of this paper, we are forced to summarise the results obtained in our study in table 1 (right detection rates).
3.1 RGB Model
This colour model is not very robust to changes in the illumination of the images. This explains why the channel with the most hits was the G channel, when obviously the most important channel for finding skin colour is R. The worst results in this model were, as expected, obtained by the B channel. We can conclude that this colour space is not the most appropriate one for finding skin colour in an image, although it can be used with success if the environment (illumination) is constant.
3.2 CMY Model
This colour space obtains the worst results in the study. This is quite reasonable because it is intended for other fields (printing rather than processing). Taking into account that this colour space is quite similar to RGB, it is natural that the best channel was M (because in RGB it was G), for the same reasons.
Table 1: Right detections for each colour space using the three channels of the space separately and the three together.
            RGB      CMY      YUV      YIQ      YCbCr    YPbPr    YCgCr    YDbDr    HSV      XYZ
C1          82.79%   82.86%   84.69%   84.69%   87.06%   84.69%   88.95%   88.95%   72.16%   87.44%
C2          87.23%   86.6%    87.86%   89.93%   87.86%   87.85%   87.48%   90.58%   94.04%   85.12%
C3          79.75%   82.33%   90.19%   70.32%   90.19%   90.19%   93.19%   91.42%   82.08%   74.1%
C1C2C3      86.55%   86.15%   89.7%    86.93%   89.8%    89.7%    92.63%   92.29%   95.06%   86.27%
3.3 YUV Model
The third channel of this colour space gives quite good results (90.19% of right detections). The V channel stores the difference between the red component of the colour and the luminance. In this colour space, as in most of the following ones, luminance information is separated from chrominance information, so the results obtained are better than for the RGB model because they are more robust to brightness variations.
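For illustration, the usual analogue RGB-to-YUV conversion can be sketched as follows (standard coefficients; the exact conversion used in our tool is not listed here):

    import numpy as np

    def rgb_to_yuv(img):
        """Convert an H x W x 3 RGB image with values in [0, 1] to YUV.

        Y is the luminance; U and V are the blue and red colour differences,
        so V is the channel that keeps most of the skin (red) information.
        """
        r, g, b = img[..., 0], img[..., 1], img[..., 2]
        y = 0.299 * r + 0.587 * g + 0.114 * b
        u = 0.492 * (b - y)
        v = 0.877 * (r - y)
        return np.stack([y, u, v], axis=-1)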
3.4 YIQ Model
Our conclusion for this colour space is that, although it obtains acceptable results, the YUV model, which is very similar, obtains better results. Therefore, according to our study, it is better to use the YUV colour space than YIQ for skin detection.
3.5 YCbCr Model
This colour model is very similar to YUV. We can state that this colour space obtains quite good results in general for skin colour detection, especially the Cr channel (90.19% of right detections), which stores the information of the red component of the colour model.
3.6 YPbPr Model
This colour space is the analogue version of the YCbCr model. Consequently, this colour model also gets its best result for the Pr channel. We can conclude that using the YPbPr model is equivalent to using the YCbCr model, but the latter is more commonly used for skin detection than YPbPr.
3.7 YCgCr Model
This colour space is a variation of the typical YCbCr model, since it uses the Cg channel instead of the Cb channel (de Dios, 2004). In this colour space, using the three channels together, there is a significant gain (from 89.8% for YCbCr to 92.63% for YCgCr), because the green component is considerably better than the blue component for detecting skin colour. In fact, the global right detection rate of this colour space is one of the highest in the study (only beaten by the HSV colour space).
3.8 YDbDr Model
This colour model is quite similar to the previous ones, but it gives better results for the blue channel than the other colour models (90.58% of right detections; the other colour spaces do not exceed 87.8% in any case) without losing precision in the other channels (especially in the Dr channel, which is very important because red is the channel that provides the best results in skin detection). For this reason, the YDbDr colour format provides quite good results both when studying the channels separately and when studying the three channels together (92.29% of hits).
3.9 HSV Model
This colour model provides the best results in our study. The right detection rate of this colour format using the three channels together is 95.06%. The best single channel is S, which refers to the saturation of the image and provides an average right detection rate of 94.04%. However, the H component has a quite low success rate (only 72.16%), for the same reason that RGB had a quite low success rate for the R channel: some skin parts of some faces in the database are confused with the background when the face has some brightness. Figure 3 shows an example of a picture from the facial database used in our study and the output provided by the K-Means classifier for each of the components of the HSV colour space. Skin colour detections by the classifier are coloured in pink.
Figure 3: From left to right: original face, K-means
classifier output for H channel, K-means classifier output
for S channel and K-means classifier output for V channel.
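As an illustration of why the S channel is attractive, the standard HSV saturation definition can be computed directly from RGB (a sketch, not the code used in the study):

    import numpy as np

    def saturation_channel(img):
        """Extract the HSV saturation channel from an H x W x 3 RGB image in [0, 1].

        S = (max - min) / max over the R, G and B values of a pixel,
        with S = 0 for black pixels.
        """
        mx = img.max(axis=-1)
        mn = img.min(axis=-1)
        return np.where(mx > 0, (mx - mn) / np.where(mx > 0, mx, 1.0), 0.0)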
3.10 CIE-XYZ Model
The results obtained with this colour space are quite similar to those obtained using RGB. However, we can notice that XYZ provides better results for the red-related channel (X in this case, 87.44%) than the RGB format (82.79%). So, we can say that the XYZ colour space is a bit more robust to illumination conditions than RGB.
4 CONCLUSIONS
We have carried out a complete study at pixel level of 10 different colour spaces using typical images from face recognition systems. The purpose of this study was to perform an objective comparison among the colour spaces most used in skin detection, to discover which colour model provides the best results. We can group the different colour spaces into 4 families: the RGB family (RGB and CMY), the YUV family (YUV, YIQ, YCbCr, YPbPr, YCgCr, YDbDr), the HSV family and the CIE family. According to the results obtained, the most appropriate family for skin detection is HSV (because the HSV colour format is the winner in our study). However, there is one component which, in general, provides consistent and positive results in all colour models: the red component (the most significant channel for skin detection).
We can also state that the luminance channel (in colour spaces where it is separated from chrominance: Y in almost all of them and V in HSV) is not a very important channel for skin colour detection. In fact, we can also say that colour spaces where luminance and chrominance are separated get better results (RGB, CMY and XYZ, which do not separate them, have the lowest right detection rates of all the models).
All in all, the 10 colour spaces that we have studied provide quite good results in skin colour detection. In general, all colour models have low false positive and false negative rates (peak values are explained by some unlucky highlights on the faces of some people), and the right detection rates using the three channels together are above 86% for all colour spaces, so we can conclude that all the models can be used for skin colour detection with more or less success and precision (this explains why in the literature there are studies using such a variety of colour spaces).
To sum up, after the quantitative study described in this paper, we can conclude that the HSV colour space is the model which gets the best results for skin colour detection. On the other hand, there are colour spaces that obtain rather poor results, such as CMY, CIE-XYZ, YIQ or even RGB. In any case, it is possible to use almost any colour space to find skin colour, because with an appropriate classifier and some pre-processing of the images (such as increasing the contrast) most colour spaces reach quite high right detection rates.
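The pre-processing step is not specified in this paper; as an example of what such a step could look like, a simple percentile-based linear contrast stretch is sketched below (illustrative only):

    import numpy as np

    def stretch_contrast(img, low_pct=2, high_pct=98):
        """Linearly stretch pixel values so the given percentiles map to 0 and 1."""
        lo, hi = np.percentile(img, [low_pct, high_pct])
        return np.clip((img - lo) / max(hi - lo, 1e-8), 0.0, 1.0)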
ACKNOWLEDGEMENTS
This work has been developed in part thanks to the
OPLINK project (TIN2005-08818-C04-03). José M.
Chaves-González is supported by research grant
PRE06003 from Junta de Extremadura (Spain).
REFERENCES
Albiol, A., Torres, L., et al. Optimum Color Spaces for
Skin Detection. IEEE International Conference on
Image Processing. vol 1, pp: 122-124, Oct. 2001.
Campadelli, P., Lanzarotti, R., et al. Face and facial
feature localization. Int. Conf. on Image Analysis and
Processing. vol 1, pp: 1002-1009, Oct. 2005.
de Dios, J.J., Garcia, N. Fast face segmentation in
component color space. International Conference on
Image Processing. vol. 1, pp: 191-194. Oct. 2004.
Jinfeng Y., Zhouyu F., et al. Adaptive skin detection using
multiple cues. International Conference on Image
Processing. vol. 2, pp: 901-904. Oct. 2004.
Martinez, A.M., Benavente, R. The AR Face Database.
CVC Technical Report #24, June 1998.
Naseem, I., Deriche, M. Robust human face detection in
complex color images. IEEE International Conference
on Image Processing. vol. 2, pp: 338-341, Sept. 2005.
Phung, S.L., Bouzerdoum, A., et al. Skin segmentation
using color pixel classification: analysis and
comparison. IEEE Transactions on PAMI. vol. 27, No.
1, pp.: 148-154, Jan. 2005.
Pratt, W.K. Digital Image Processing: PIKS Inside. 3rd
edition. John Wiley & Sons, 2001.
Runsheng J., Bin K., et al. Color Edge Detection Based on
YUV Space and Minimal Spanning Tree. IEEE
International Conference on Information Acquisition.
vol. 1, pp: 941 – 945, Aug. 2006.
Shapiro, L.G., Stockman, G.C. Computer Vision. Prentice-
Hall, 2001.
Sigal, L., Sclaroff, S., et al. Skin color-based video
segmentation under time-varying illumination. IEEE
Transactions on Pattern Analysis and Machine
Intelligence. vol. 26, No. 7, pp: 862 – 877, July 2004.
Yang, M-H., Kriegman, D.J., et al. Detecting Faces in
Images: A Survey. IEEE Transactions on PAMI, Vol.
24, No. 1, pp: 34-58, Jan. 2002.
Zhao, W., Chellappa, R., et al. Face Recognition: A
literature survey. ACM Computing Surveys, Vol. 35,
No. 4, pp: 399-458, Dec. 2003.