Data Based Color Constancy
Wei Xu, Huaxin Xiao, Yu Liu and Maojun Zhang
College of Information System and Management, National University of Defense Technology, Changsha, China
Keywords:
Color Constancy, Color Gamut, Canonical Illuminant, Data Driven, Kernel Method.
Abstract:
Color constancy is an important task in computer vision. By analyzing the image formation model, we show that color gamut data under one light source can be mapped to a hyperplane whose normal vector is determined only by that light source. The canonical light source is therefore represented implicitly by a kernel method trained on its color data. When an image is captured under an unknown illuminant, a correction matrix is obtained through optimization, such that the corrected color data, after being mapped to the high-dimensional space, best fit the hyperplane of the canonical illuminant. The proposed unsupervised feature-mining kernel method depends only on the color data, without any other information. Experiments on standard test datasets show that the proposed method achieves performance comparable with other state-of-the-art methods.
1 INTRODUCTION
Under different color light sources, objects present
different colors. Human eyes automatically adapt to such light changes, retaining the perception of object color. This phenomenon is known as color con-
stancy. However, given the limitation of image sen-
sors, the image captured by digital imaging devices
does not have color constancy. Color is one of the
most important features of vision. Many computer
vision applications need color information, such as
image segmentation (Zhuang et al., 2012), object recog-
nition and tracking (Bousetouane et al., 2013), and
image retrieval (Stottinger et al., 2012). The object
color variation caused by a light source influences the
robustness of the above algorithms. Thus, the color
constancy algorithm is important in computer vision.
Many color constancy methods have been pro-
posed in the past decades. Some of these methods
directly set constraints or assumptions on the scene,
such as Grey World (Buchsbaum, 1980), White Patch
(Land, 1977), Shades of Grey (Finlayson and Trezzi,
2004), Grey Edge (van de Weijer et al., 2007), Grey
Block-Differencing (Lai et al., 2013), and Gamut
Mapping (Forsyth, 1990; Gijsenij et al., 2010). Other
methods seek light source-related prior information
by machine learning to predict the unknown illumi-
nant. Cardei et al. (Cardei et al., 2002) employed a neural network. Although this approach seems reasonable, it lacks an explicit model of the problem, and in practice its generalization ability is poor. Bayesian statistical theory
(Brainard and Freeman, 1997; Finlayson et al., 2001;
Rosenberg et al., 2004; Gehler et al., 2008) is also
commonly used, which determines the specific prob-
ability distribution as prior knowledge. However, ac-
curately describing this distribution is difficult. Funt
et al. (Funt and Xiong, 2004) used Support Vec-
tor Regression (SVR). Their training set consists of
the binarized chromaticity histograms of many im-
ages. Chakrabarti et al. (Chakrabarti et al., 2008;
Chakrabarti et al., 2012) attempted to develop a statis-
tical model for the spatial correlations between pixels.
In this study, we examine the color gamut data
from a new perspective. By analyzing the image for-
mation model, the color gamut data can be mapped
to a hyperplane whose normal vector is determined
by the light source. This hyperplane unifies the dis-
orderly color gamut data. Thus, the color data cap-
tured under the canonical illuminant are trained to
implicitly obtain the features of the canonical light
source. The training method is an unsupervised feature-mining kernel method, analogous to the supervised SVR. Our algorithm uses the entire
color gamut during training, which is different from
the gamut mapping method (Finlayson et al., 1993)
that uses the data only on the convex hull. When cor-
recting an image under an unknown illuminant, the
correction matrix is computed by optimization. The color data of the corrected image then best fit the
hyperplane of the canonical illuminant after mapping
to a high-dimensional space. Considering that this
method only depends on the color data without any
other information, it is called the data-driven color
constancy method in this study.
The rest of this paper is organized as follows.
Section 2 derives our algorithm from an analysis of the image formation model. Section 3 tests
our method on the standard test datasets and discusses
the results. Finally, Section 4 summarizes this work
and discusses future research.
2 PROPOSED METHOD
2.1 Analysis of the Color Gamut Data
Similar to most color constancy algorithms, we as-
sume that the scene only has one light source and the
object reflection is diffuse, which conforms to the
ideal Lambertian model (Barnard, 1999). According
to the Lambertian reflection model, image formation
depends on three factors: the light source L(λ), sur-
face reflectance properties S(x,λ), and camera sensi-
tivity C(λ). Thus:
$$\boldsymbol{\rho}^{\mu}(x) = \int_{\omega} S(x,\lambda)\,L(\lambda)\,C(\lambda)\,d\lambda, \qquad (1)$$
where $x$ is the spatial coordinate of the image; $\lambda$ is the spectrum wavelength; $\omega$ is the visible wavelength range; $C(\lambda) = [C_r(\lambda), C_g(\lambda), C_b(\lambda)]^T$ is the spectral response function of the camera for the three color bands (i.e., red, green, and blue); and $\boldsymbol{\rho}^{\mu}(x)$ is the RGB value at point $x$ under the unknown illuminant $\mu$, namely, $\boldsymbol{\rho}^{\mu}(x) = [\rho^{\mu}_r(x), \rho^{\mu}_g(x), \rho^{\mu}_b(x)]^T$.
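To make Eq. (1) concrete, the following sketch discretizes the integral on a wavelength grid; all spectra here are synthetic placeholders rather than measured camera or illuminant data.

```python
import numpy as np

# Discretize Eq. (1) on a wavelength grid; all spectra below are
# synthetic placeholders, not measured data.
wavelengths = np.linspace(400, 700, 31)            # visible range, 10 nm steps
d_lambda = wavelengths[1] - wavelengths[0]

L = np.exp(-((wavelengths - 550) / 120.0) ** 2)    # illuminant power L(lambda)
S = np.random.rand(31)                             # reflectance S(x, lambda) at one pixel
C = np.stack([                                     # camera sensitivities C_r, C_g, C_b
    np.exp(-((wavelengths - 600) / 40.0) ** 2),
    np.exp(-((wavelengths - 540) / 40.0) ** 2),
    np.exp(-((wavelengths - 460) / 40.0) ** 2),
])

rho = (C * S * L).sum(axis=1) * d_lambda           # Riemann sum of S*L*C over omega
print(rho)                                         # RGB response at this pixel
```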
The aim of color constancy is to adjust the image
obtained under an unknown illuminant to the canoni-
cal illuminant, as shown in the following equation:
$$\boldsymbol{\rho}^{c}(x) = \Lambda^{\mu,c}\,\boldsymbol{\rho}^{\mu}(x), \qquad (2)$$
where $\boldsymbol{\rho}^{c}(x)$ is the image under the canonical illuminant and $\Lambda^{\mu,c}$ is a constant diagonal matrix (Finlayson et al., 1993).
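Eq. (2) is a per-channel (von Kries style) scaling, so applying it to an image is a single broadcasted multiplication; a minimal sketch with illustrative gain values:

```python
import numpy as np

# Apply the diagonal correction of Eq. (2) to an H x W x 3 image.
# The per-channel gains are illustrative, not estimated values.
image_mu = np.random.rand(480, 640, 3)          # image under the unknown illuminant
gains = np.array([1.2, 1.0, 0.8])               # diagonal of Lambda^{mu,c}

image_c = np.clip(image_mu * gains, 0.0, 1.0)   # corrected image
```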
Based on a previous study (Cheng et al., 1998), the spectral power distribution can be expressed as follows:
$$L(\lambda) = \sum_{i=1}^{n} e_i E_i(\lambda), \qquad (3)$$
where $E_i(\lambda)$ is the basis function used to describe the spectral power distributions of illuminants, $e_i$ is the corresponding weight, and $n$ is the number of basis functions. Substituting $L(\lambda)$ in Equation (1) with Equation (3) yields the following:
$$\boldsymbol{\rho}(x) = \sum_{i=1}^{n} e_i \int_{\omega} E_i(\lambda)\,S(x,\lambda)\,C(\lambda)\,d\lambda. \qquad (4)$$
For each component of $\boldsymbol{\rho}(x)$, the following equation is derived:
$$1 = \sum_{i=1}^{n} e_i \int_{\omega} \frac{E_i(\lambda)\,S(x,\lambda)\,C_k(\lambda)}{\rho_k(x)}\,d\lambda, \qquad (5)$$
where $k = r, g, b$.
Summing up the red, green, and blue components yields the following:
$$-\sum_{i=1}^{n} e_i \int_{\omega} \sum_{k=r,g,b} \frac{E_i(\lambda)\,S(x,\lambda)\,C_k(\lambda)}{3\rho_k(x)}\,d\lambda + 1 = 0. \qquad (6)$$
It can be written in hyperplane form as follows:
$$\langle \mathbf{w}, \boldsymbol{\chi} \rangle + 1 = 0, \qquad (7)$$
where $\langle \cdot,\cdot \rangle$ is the inner product, $\mathbf{w} = [e_1, e_2, \cdots, e_n]^T$, and $\boldsymbol{\chi} = -\left[\int_{\omega} \sum_{k=r,g,b} \frac{E_i(\lambda)\,S(x,\lambda)\,C_k(\lambda)}{3\rho_k(x)}\,d\lambda\right]_{i=1}^{n}$. We then define the function $\Phi$ that maps the color gamut data to the $n$-dimensional space $H$. If it satisfies
$$\Phi([\rho_r(x), \rho_g(x), \rho_b(x)]) = -\left[\int_{\omega} \sum_{k=r,g,b} \frac{E_i(\lambda)\,S(x,\lambda)\,C_k(\lambda)}{3\rho_k(x)}\,d\lambda\right]_{i=1}^{n}, \quad \forall x, \qquad (8)$$
then any pixel in the image projects onto the hyperplane determined by the normal vector $\mathbf{w}$. The normal vector $\mathbf{w}$ is related only to the illuminant parameters $e_i$, which represent the light source characteristics.
2.2 Estimation of Illuminant Parameters

Suppose the color gamut of the known canonical illuminant $c$ is $\{\boldsymbol{\rho}^{c}(i)\}_{i=1,2,\cdots,P}$, where $P$ is the number of data points in the color gamut and $\boldsymbol{\rho}^{c}(i)$ is the RGB value of data point $i$. After applying the mapping function $\Phi$, the new data $\boldsymbol{\chi}^{c}_{i} = \Phi(\boldsymbol{\rho}^{c}(i))$ satisfy the corresponding hyperplane $\Gamma^{c}$ of the light source. Thus:
$$\Gamma^{c}: \langle \mathbf{w}^{c}, \boldsymbol{\chi}^{c}_{i} \rangle + 1 = 0, \qquad (9)$$
where the normal vector $\mathbf{w}^{c}$ is the illuminant characteristic.
The kernel function expresses the inner product of the high-dimensional space in the original low-dimensional space, i.e., $K(a,b) = \langle \Phi(a), \Phi(b) \rangle$. Thus, a kernel function is employed in place of the explicit mapping $\Phi$.
Similar to the derivation of SVR, the loss function $\xi = L(0, \langle \mathbf{w}^{c}, \boldsymbol{\chi}^{c}_{i} \rangle + 1)$ is defined. The target is to minimize the total loss as follows:
$$\mathbf{w}^{c} = \arg\min_{\mathbf{w}^{c}} \sum_{i=1}^{P} \xi_i \qquad (10)$$
$$\text{s.t.} \quad \xi_i = L(0, \langle \mathbf{w}^{c}, \boldsymbol{\chi}^{c}_{i} \rangle + 1).$$
The $\varepsilon$-insensitive loss function is used to deal with the noise in the observed data as follows:
$$L(0, \langle \mathbf{w}^{c}, \boldsymbol{\chi}^{c}_{i} \rangle + 1) = \left| \langle \mathbf{w}^{c}, \boldsymbol{\chi}^{c}_{i} \rangle + 1 \right|_{\varepsilon}. \qquad (11)$$
In general, the observed data contain noise. Regularization is utilized to control the capacity of the function class and avoid over-fitting. In particular, the magnitude of $\|\mathbf{w}^{c}\|^{2}$ is restricted, where $\|\cdot\|$ is the 2-norm operator, and the variable $\eta$ is used as the weight that balances capacity and errors. The problem is then transformed into the following minimization of the optimization objective function:
$$\mathbf{w}^{c} = \arg\min_{\mathbf{w}^{c}} \frac{1}{2}\|\mathbf{w}^{c}\|^{2} + \eta \sum_{i=1}^{P} (\xi_i + \hat{\xi}_i) \qquad (12)$$
$$\text{s.t.} \quad \langle \mathbf{w}^{c}, \boldsymbol{\chi}^{c}_{i} \rangle + 1 \le \varepsilon + \xi_i,$$
$$\qquad -(\langle \mathbf{w}^{c}, \boldsymbol{\chi}^{c}_{i} \rangle + 1) \le \varepsilon + \hat{\xi}_i,$$
$$\qquad \xi_i, \hat{\xi}_i \ge 0, \quad i = 1, 2, \cdots, P,$$
where $\xi_i$ and $\hat{\xi}_i$ are relaxation factors, and the constant $\eta > 0$ determines the penalty for errors exceeding $\varepsilon$.
By Lagrange duality theory and the introduction of the kernel function, Eq. (12) can be converted into the following quadratic programming problem:
$$(\hat{\boldsymbol{\alpha}}, \boldsymbol{\alpha}) = \arg\min_{\hat{\boldsymbol{\alpha}}, \boldsymbol{\alpha}} \frac{1}{2} \sum_{i=1}^{P} \sum_{j=1}^{P} (\hat{\alpha}_i - \alpha_i)(\hat{\alpha}_j - \alpha_j) K(\boldsymbol{\rho}^{c}(i), \boldsymbol{\rho}^{c}(j)) + \varepsilon \sum_{i=1}^{P} (\hat{\alpha}_i + \alpha_i) + \sum_{i=1}^{P} (\hat{\alpha}_i - \alpha_i) \qquad (13)$$
$$\text{s.t.} \quad 0 \le \hat{\alpha}_i \le \eta, \quad 0 \le \alpha_i \le \eta,$$
where $K(\cdot,\cdot)$ is the kernel function. The Gaussian radial basis kernel function $K(a,b) = \exp(-\|a-b\|^{2}/(2\sigma^{2}))$ is selected in our method.
From the derivation of Eq. (13), the illuminant characteristic can be obtained as follows:
$$\mathbf{w}^{c} = \sum_{i=1}^{P} (\hat{\alpha}_i - \alpha_i)\,\Phi(\boldsymbol{\rho}^{c}(i)). \qquad (14)$$
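To illustrate how Eq. (13) could be solved in practice, the following is a minimal sketch under stated assumptions: a general-purpose bound-constrained solver (scipy's L-BFGS-B) stands in for a dedicated QP solver, the gamut data are random placeholders, and the helper names are ours, not from the paper's implementation. It returns the coefficients $\hat{\alpha}_i - \alpha_i$ that define $\mathbf{w}^{c}$ in Eq. (14).

```python
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(A, B, sigma=25.0):
    """Gaussian radial basis kernel K(a,b) = exp(-||a-b||^2 / (2 sigma^2))."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def train_hyperplane(X, eps=1e-4, eta=1.0, sigma=25.0):
    """Solve the dual of Eq. (13) for gamut data X (P x 3).

    Returns the coefficients d_i = alpha_hat_i - alpha_i of Eq. (14).
    """
    P = X.shape[0]
    K = rbf_kernel(X, X, sigma)

    def objective(z):
        ah, al = z[:P], z[P:]              # alpha_hat, alpha
        d = ah - al
        return 0.5 * d @ K @ d + eps * (ah + al).sum() + d.sum()

    z0 = np.zeros(2 * P)
    bounds = [(0.0, eta)] * (2 * P)        # box constraints 0 <= alpha <= eta
    res = minimize(objective, z0, method="L-BFGS-B", bounds=bounds)
    ah, al = res.x[:P], res.x[P:]
    return ah - al                         # sparse in exact arithmetic

# Toy usage with random stand-in gamut data
X = np.random.rand(200, 3)
d = train_hyperplane(X)
```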
The illuminant can be implicitly expressed by the parameters $\hat{\alpha}_i$ and $\alpha_i$. Although $\mathbf{w}^{c}$ still involves the mapping $\Phi(\boldsymbol{\rho}^{c}(i))$, the next section shows that the correction matrix can be obtained once $\hat{\alpha}_i$ and $\alpha_i$ are known.

According to the properties of the support vectors, the obtained $\mathbf{w}^{c}$ is sparse because of the use of the insensitive loss function. Although the coefficient vector of $\mathbf{w}^{c}$ is $P$-dimensional (i.e., $P$ is the number of data points in the color gamut), many of its entries are zero. The number of non-zeros is exactly the number of support vectors.

During training, given the large amount of color gamut data (i.e., hundreds of thousands of points), the memory consumption is very large. Thus, we designed an iterative algorithm, sketched below. In Step 1, 1000 points are uniformly chosen as the subset for training, and their hyperplane is obtained. In Step 2, the 150 points with the maximum distance to the hyperplane are selected and added to the subset to train a new hyperplane. Step 2 is repeated several times to obtain the best hyperplane.
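A sketch of this iterative subset scheme, reusing the hypothetical train_hyperplane and rbf_kernel helpers from the previous listing; since $\|\mathbf{w}^{c}\|$ is constant for a given fit, the residual $|\sum_i (\hat{\alpha}_i - \alpha_i) K(\boldsymbol{\rho}^{c}(i), x) + 1|$ is proportional to a point's distance to the hyperplane.

```python
import numpy as np

# Reuses train_hyperplane and rbf_kernel from the previous sketch.
def train_on_subset(X_full, n_init=1000, n_add=150, n_rounds=5):
    """Iteratively grow the training subset with the worst-fitting points."""
    rng = np.random.default_rng(0)
    idx = rng.choice(len(X_full), size=n_init, replace=False)
    subset = X_full[idx]

    for _ in range(n_rounds):
        d = train_hyperplane(subset)                    # Step 1: fit current subset
        # Residual of Eq. (9) for every gamut point (up to the constant ||w||)
        resid = np.abs(rbf_kernel(X_full, subset) @ d + 1.0)
        worst = np.argsort(resid)[-n_add:]              # Step 2: farthest points
        subset = np.vstack([subset, X_full[worst]])

    return subset, train_hyperplane(subset)
```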
2.3 Estimation of the Correction Matrix

Given an image $\boldsymbol{\rho}^{\mu}(x)$ under an unknown illuminant $\mu$, the correction employs the diagonal matrix $\Lambda^{\mu,c} = \mathrm{diag}(\mathbf{f})$ according to Eq. (2). After the correction, the image color gamut in the high-dimensional space should best fit the canonical illuminant hyperplane $\Gamma^{c}$. This optimization problem is as follows:
$$\Lambda^{\mu,c} = \arg\min_{\Lambda^{\mu,c}} \sum_{j=1}^{Q} \frac{(\langle \mathbf{w}^{c}, \Phi(\Lambda^{\mu,c} \boldsymbol{\rho}^{\mu}(j)) \rangle + 1)^{2}}{\|\mathbf{w}^{c}\|^{2}}, \qquad (15)$$
where $\boldsymbol{\rho}^{\mu}(j)$ is the RGB value at point $j$ in the unknown illuminant color gamut, and $Q$ is the number of data points in the color gamut. When $\Lambda^{\mu,c} = 0$, Eq. (15) clearly degenerates and is trivially satisfied. Thus, a constraint is required. In our method, the constraint is $\mathbf{f}^{T}\mathbf{f} = 3$. Its physical meaning is that the image brightness is preserved after the correction; $\Lambda^{\mu,c} = 0$ would indicate that the image turns black after the correction. Given that $\mathbf{w}^{c}$ can be expressed as Eq. (14) and $\|\mathbf{w}^{c}\|^{2}$ is a constant, as shown in the previous section, Eq. (15) can be rewritten as follows:
$$\Lambda^{\mu,c} = \arg\min_{\Lambda^{\mu,c}} \sum_{j=1}^{Q} \left( \sum_{i=1}^{P} (\hat{\alpha}_i - \alpha_i)\,K\!\left(\Lambda^{\mu,c} \boldsymbol{\rho}^{\mu}(j),\, \boldsymbol{\rho}^{c}(i)\right) + 1 \right)^{2} \qquad (16)$$
$$\text{s.t.} \quad \mathbf{f}^{T}\mathbf{f} = 3.$$
Eq. (16) is a nonlinear optimization problem. The damped Gauss-Newton method (Subramanian, 1993; Pan and Chen, 2009), whose convergence is fast and accurate, is used to solve it. Approximately 10 iterations are sufficient in our application. More importantly, a globally unique solution can be obtained.
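A minimal sketch of this correction step, again under assumptions: scipy's least_squares (a damped least-squares routine) stands in for the damped Gauss-Newton solver of the cited references, the constraint $\mathbf{f}^{T}\mathbf{f} = 3$ is enforced by reparametrization rather than handled explicitly, and rbf_kernel is the hypothetical helper from the training sketch.

```python
import numpy as np
from scipy.optimize import least_squares

def estimate_correction(X_mu, X_c, d, sigma=25.0):
    """Estimate the diagonal of Lambda^{mu,c} by minimizing Eq. (16).

    X_mu: Q x 3 gamut under the unknown illuminant; X_c: P x 3 canonical
    gamut; d: coefficients alpha_hat - alpha from training (rbf_kernel is
    the helper from the training sketch).
    """
    def residuals(g):
        f = np.sqrt(3.0) * g / np.linalg.norm(g)   # enforce f^T f = 3 exactly
        corrected = X_mu * f                       # apply diag(f) to every pixel
        return rbf_kernel(corrected, X_c, sigma) @ d + 1.0

    res = least_squares(residuals, x0=np.ones(3), method="lm", max_nfev=200)
    g = res.x
    return np.sqrt(3.0) * g / np.linalg.norm(g)    # diagonal of Lambda^{mu,c}
```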
3 EXPERIMENTS AND ANALYSIS
Figure 1: Camera used for the SFU Grey-ball dataset col-
lection.
The performance of the proposed algorithm is tested in this section. We primarily performed the experiments on the SFU Grey-ball dataset proposed by Funt et al. (Ciurea and Funt, 2003). This dataset consists of 11,346 consecutive images captured with a video camera. The video contains various indoor and outdoor scenes under different illuminants. The ground truth is captured using a grey sphere mounted at the bottom of the camera (Fig. 1), so the sphere is always visible in the images. Some sample images are shown in Fig. 2. The ground truth illuminant of each image and the test results of some state-of-the-art methods on the full set are published on the color constancy research website (Gijsenij and Gevers).
Figure 2: Sample images in the SFU Grey-ball dataset.
The angular error between the estimated illuminant vector $\mathbf{e}_{\varepsilon}$ and the ground truth vector $\mathbf{e}_{i}$ is employed to provide quantitative quality evaluations as follows:
$$e_{\mathrm{ang}} = \cos^{-1}\!\left( \frac{\mathbf{e}_{i} \cdot \mathbf{e}_{\varepsilon}}{\|\mathbf{e}_{i}\|\,\|\mathbf{e}_{\varepsilon}\|} \right). \qquad (17)$$
Suppose the correction matrix is $\Lambda^{\mu,c} = \mathrm{diag}(l, m, n)$. The estimated illuminant vector $\mathbf{e}_{\varepsilon}$ is then computed as $\mathbf{e}_{\varepsilon} = (m/l, 1, m/n)$, where the coefficient of the green component is set to 1. The mean and median angular errors are selected for performance evaluation.
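A small sketch of this metric (the function name is ours):

```python
import numpy as np

def angular_error_deg(e_true, e_est):
    """Angular error of Eq. (17), in degrees."""
    cos = np.dot(e_true, e_est) / (np.linalg.norm(e_true) * np.linalg.norm(e_est))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# From the correction matrix diag(l, m, n), the estimated illuminant is
# (m/l, 1, m/n), with the green component fixed at 1 (values illustrative).
l, m, n = 1.2, 1.0, 0.8
e_est = np.array([m / l, 1.0, m / n])
```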
3.1 Experiment Details
We randomly selected 50 images from the training dataset of the canonical illuminant. These images include various scenes under different light sources and are corrected by the ground truth before training. Based on these images, the color gamut (approximately 40,000 points) of the canonical illuminant can be drawn (Fig. 3).
Figure 3: Color gamut of the canonical illuminant.
Three parameters (i.e., $\eta$, $\varepsilon$, and $\sigma$) exist in the training process (Eq. (13)). They are set empirically according to their meanings. $\varepsilon$, the degree of tolerance to noise, is set at $\varepsilon = 0.0001$. $\eta$, the degree of the penalty for errors exceeding $\varepsilon$, is set at $\eta = 1$. $\sigma$, the parameter of the Gaussian radial basis kernel, is set at $\sigma = 25$. Training takes 20-30 min on our PC (CPU: 3.10 GHz, OS: Windows 7 64-bit, RAM: 6 GB). Once the illuminant characteristic $\mathbf{w}^{c}$ is obtained, it can be used for the correction of all images.
3.2 Experiment Results
We uniformly selected approximately 1000 images (excluding the training images) from the dataset to serve as the estimation set. The correction matrix was obtained by the optimization of Eq. (16). Given that the grey ball in each image records the actual illuminant, it is masked to avoid influencing the estimate. To decrease the computation time, we uniformly selected 25% of the color gamut data under an unknown illuminant for the optimization of the correction matrix. In this case, the optimization requires 3-5 min.
Several typical color constancy methods were se-
lected as the comparison methods, namely, the Grey
World (Buchsbaum, 1980), Shades of Grey (Finlayson
and Trezzi, 2004), second-order Grey Edge (van de
Weijer et al., 2007), SVR (Funt and Xiong, 2004),
Generalized Gamut Mapping (Gijsenij et al., 2010),
and Spatial Correlations (Chakrabarti et al., 2012).
Some of these methods also require a training set. We set the training conditions according to the authors' recommendations in their papers. For SVR (Funt and Xiong, 2004), 3D (chromaticity plus intensity) histograms are used. The radial basis kernel function is also chosen, with insensitivity parameter ε = 0.00001, penalty value η = 0.1, and shape parameter γ = 0.025. A training set of 45,000 histograms from 6000 images is employed. For Generalized Gamut Mapping (Gijsenij et al., 2010), first-order data (i.e., edges) are used. The training set also has 6000 images from 15 different typical scenes. The standard deviation of the Gaussian smoothing filter is 3. For Spatial Correlations (Chakrabarti et al., 2012), the maximum likelihood estimator is used. The training set also has 6000 images, like that of Generalized Gamut Mapping.
Table 1: Angular errors for different color constancy methods on the SFU Grey-ball dataset.

Method                       Mean   Median
Grey World                   8.7    8.4
Shades of Grey               6.9    6.4
Second-order Grey Edge       6.3    5.6
SVR                          7.0    6.0
Generalized Gamut Mapping    6.0    4.7
Spatial Correlations         5.4    4.1
Proposed Method              5.2    3.3
Figure 4: Sample comparison results of the SFU Grey-ball
dataset. The angular error is listed on the grey ball.
Table 1 shows a comparison of the results. It indicates that our method is superior to the other methods. Some sample comparison results are shown in Fig. 4. Taking the images in Fig. 4 as an example, the influence of the number of images in the training set is shown in Fig. 5. The correction performance improves with more training images; about fifty images are required to obtain stable results.
4 CONCLUSIONS
Figure 5: Angular error as a function of the number of images in the training set, for the images in Fig. 4.

We used an unsupervised kernel method, which is similar to the supervised learning method of SVR. For an image under an unknown illuminant, the correction matrix is obtained by an optimization whose objective is that the corrected color data satisfy the hyperplane of the canonical illuminant after being mapped to a high-dimensional space. After the correction, the sum of the distances from each point to the hyperplane is minimal. The test results on the SFU Grey-ball dataset show the effectiveness of the proposed method. The superiority of the proposed method is also supported by the analysis: the slice chart of the distance from the points in the RGB space to the hyperplane of the canonical illuminant shows that the method is consistent with the grey world assumption. However, unlike strong-constraint methods such as Grey World, the proposed method does not impose these types of constraints on each scene. Therefore, better accuracy is achieved by this model, and large areas of monochromatic objects can be handled better.

Future work mainly focuses on two aspects. One is computational efficiency. The estimation of the correction matrix requires several minutes, which limits real-time application in digital imaging devices; moreover, the training is also time-consuming. The other aspect is the management of complex illumination environments. The ideal Lambertian model only deals with diffuse reflection. Although this is the most common case, specular reflection unavoidably appears in many scenes. Another complex illumination environment is the multiple light source scene. The effectiveness of the proposed approach could be influenced by these complex illumination conditions, since it assumes only the Lambertian model. The extension to these complex cases is another difficult but significant line of work.
ACKNOWLEDGEMENTS
This research was partially supported by the National Natural Science Foundation of China (NSFC) under project No. 61403403 and the First Class General Financial Grant of the China Postdoctoral Science Foundation under project No. 2015M52707.
REFERENCES
Barnard, K. (1999). Practical Color Constancy. PhD thesis, Simon Fraser University, Vancouver.
Bousetouane, F., Dib, L., and Snoussi, H. (2013). Improved
mean shift integrating texture and color features for
robust real time object tracking. The Visual Computer,
29(3):155–170.
Brainard, D. H. and Freeman, W. T. (1997). Bayesian color
constancy. Journal of the Optical Society of American
A: Optics and Image Science, and Vision, 14(7):1393–
1411.
Buchsbaum, G. (1980). A spatial processor model for ob-
ject color perception. Journal of the Franklin Institute,
310(1):337–350.
Cardei, V., Funt, B., and Barnard, K. (2002). Estimating
the scene illumination chromaticity using a neural net-
work. Journal of the Optical Society of America A: Optics, Image Science, and Vision, 19(12):2374–2386.
Chakrabarti, A., Hirakawa, K., and Zickler, T. (2008). Color
constancy beyond bags of pixels. In Computer Vision
and Pattern Recognition. IEEE.
Chakrabarti, A., Hirakawa, K., and Zickler, T. (2012). Color
constancy with spatio-spectral statistics. Pattern Anal-
ysis and Machine Intelligence, IEEE Transactions on,
34(8):1509–1519.
Ciurea, F. and Funt, B. (2003). A large image database for color constancy research. In 11th Color Imaging
Conference, pages 160–164.
Cheng, F.-H., Hsu, W.-H., and Chen, T.-W. (1998). Recovering colors in an image with chromatic illuminant. Image Processing, IEEE Transactions on, 7(11):1524–1533.
Finlayson, G., Drew, M., and Funt, B. (1993). Diagonal
transforms suffice for color constancy. In Int. Conf.
Computer Vision. IEEE.
Finlayson, G. and Trezzi, E. (2004). Shades of grey and
color constancy. In 12th Color Imaging Conference,
pages 37–41.
Finlayson, G. D., Hordley, S. D., and Hubel, P. M. (2001).
Color by correlation: A simple, unifying framework
for color constancy. Pattern Analysis and Machine In-
telligence, IEEE Transactions on, 23(11):1209–1221.
Forsyth, D. A. (1990). A novel algorithm for color con-
stancy. Int. J. Computer Vision, 5:5–36.
Funt, B. and Xiong, W. (2004). Estimating illumination
chromaticity via support vector regression. In 12th Color Imaging Conference, pages 47–52.
Gehler, P. V., Rother, C., Blake, A., Minka, T., and Sharp, T.
(2008). Bayesian color constancy revisited. In Com-
puter Vision and Pattern Recognition. IEEE.
Gijsenij, A. and Gevers, T. Color constancy research website on illumination estimation. http://colorconstancy.com.
Gijsenij, A., Gevers, T., and van de Weijer, J. (2010). General-
ized gamut mapping using image derivative structures
for color constancy. Int. J. Computer Vision, 86(2-
3):127–139.
Lai, S., Tan, X., Liu, Y., Wang, B., and Zhang, M. (2013).
Fast and robust color constancy algorithm based on
grey block-differencing hypothesis. Optical review,
20(4):341–347.
Land, E. (1977). The retinex theory of color vision. Scien-
tific American, 237(6):108–128.
Pan, S. and Chen, J.-S. (2009). A damped Gauss-Newton
method for the second-order cone complementarity
problem. Applied Mathematics and Optimization, 59.
Rosenberg, C., Minka, T., and Ladsariya, A. (2004).
Bayesian color constancy with non-gaussian models.
In Advances in Neural Information Processing Systems (NIPS). MIT Press, Cambridge, MA.
Stottinger, J., Hanbury, A., Sebe, N., and Gevers, T. (2012).
Sparse color interest points for image retrieval and ob-
ject categorization. Image Processing, IEEE Transac-
tions on, 21(5):2681–2692.
Subramanian, P. K. (1993). Gauss-Newton methods for the
complementarity problem. J. Optimization theory and
applications, 77(3).
van de Weijer, J., Gevers, T., and Gijsenij, A. (2007).
Edge based color constancy. Image Processing, IEEE
Transactions on, 16(9):2207–2214.
Zhuang, H., Low, K., and Yau, W. (2012). Multichannel
pulse-coupled-neural-network-based color image seg-
mentation for object detection. Industrial Electronics, IEEE Transactions on, 59(8):3299–3308.