Texture Learning by Fractal Compression

Benoît Dolez 1,2 and Nicole Vincent 1

1 CRIP5-SIP, René Descartes University, 45 rue des Saints Pères, 75006 Paris, France
2 SAGEM DS, 178 rue de Paris, 91344 Massy, France
Abstract. This paper proposes a texture learning method based on fractal compression. This type of compression is efficient for extracting self-similarities in an image. Square blocks are studied and the similarities between them highlighted, which makes it possible to score and thus rank the blocks. Selecting the blocks that best encode the largest part of the image or of a texture leads to a database of the most representative ones. The recognition step consists in labeling the blocks and pixels of a new image: the blocks of the new image are matched with those of the different texture databases. As an application, we use our method to recognize vegetation and buildings in aerial images.
1 Introduction
Learning a set of concepts and labeling areas of a new image is a fundamental issue in the field of image processing. Texture analysis usually follows one of these approaches: structural [1], statistical [2], model-based [3] [4], or transform-based [5]. Low-level features rarely classify complex concepts well. For example, a building contains both homogeneous and geometric areas. Our aim is to take this composite aspect into account by extracting the most representative blocks of the concept samples. The means we chose is fractal compression. Fractal theory has been extensively studied during the last two decades ([6], [7], [8], [9], [10]). The basis of this kind of compression is the search for similarities in the image: a self-similarity corresponds to a redundancy of information in the signal, except for scale, contrast, luminosity or isometry. The recognition step is then equivalent to searching for the learnt blocks in the test image. This type of approach was explored in [11] for handwriting analysis. This paper proposes to continue in this direction and extend the previous study to grey-scale images and texture learning.
2 Fractal Compression
2.1 Global Principle
Fractal compression relies on Iterated Function Systems (IFS). An IFS is a set of contracting functions $T_i : M \to M$ in a metric space $M$. These contracting functions can be extended to the set of parts $P(M)$ of $M$, considering the Hausdorff distance:

$$T : P(M) \to P(M), \qquad T(\cdot) = \bigcup_{i=1}^{N} T_i(\cdot). \quad (1)$$
The fixed point theorem gives the existence and uniqueness of a subset $F$ of $M$ such that $T(F) = F$; $F$ is called the attractor of the IFS. In the fractal compression field, it rarely occurs that an image is truly self-similar. For this reason, we use the principle of Partitioned IFS (PIFS): the image is partitioned, then, for each element of the partition, we look for an area of the image that corresponds to it, up to a contracting transform. In our case the partition is a regular grid whose elements, called ranges, are chosen square. The zones of the image that may correspond are called domains.
The decompression of the image amounts to iterating the PIFS. The initial point is an image with the same dimensions as the compressed image; any content is convenient (an average grey image, for example, or an ordinary image). At each iteration, the image is transformed and converges towards the image encoded by the PIFS. We stop the iterative process when the reconstructed image is good enough or when there is no variation any more from one iteration to the next. Many papers have been published in this field during the last two decades. Some of them deal more specifically with the memory size required to encode the compressed image [12]; others study the optimization of the correspondence search between similar elements in the image ([10], [13], [14], [15], [16], [17]).
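For concreteness, a minimal Python sketch of this decoding loop is given below. The transform record layout, the block geometry and the helper names (`apply_isometry`, `decode_pifs`) are our own illustrative assumptions, not part of the original coder.

```python
import numpy as np

def apply_isometry(block, iso):
    """Apply one of the 8 square-preserving isometries
    (4 rotations, each optionally combined with a flip)."""
    if iso >= 4:
        block = np.fliplr(block)
    return np.rot90(block, iso % 4)

def decode_pifs(transforms, shape, n_iter=10, tol=1e-3):
    """Iterate a PIFS from an arbitrary start image until convergence.

    `transforms` holds one record per range block:
    ((rx, ry), (dx, dy), b, iso, c, l), where the domain is the
    (2b x 2b) area at (dx, dy), shrunk by 1/2 to match the (b x b) range.
    Assumes the ranges partition the image, so `out` is fully written.
    """
    img = np.full(shape, 128.0)  # any start content is convenient
    for _ in range(n_iter):
        out = np.empty_like(img)
        for (rx, ry), (dx, dy), b, iso, c, l in transforms:
            d = img[dy:dy + 2 * b, dx:dx + 2 * b]
            d = d.reshape(b, 2, b, 2).mean(axis=(1, 3))  # scale by 1/2
            out[ry:ry + b, rx:rx + b] = c * apply_isometry(d, iso) + l
        if np.abs(out - img).max() < tol:  # no variation any more: stop
            return out
        img = out
    return img
```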
2.2 Contracting Transform Construction
To ensure the contracting property, the magnitude of the scale factor between a domain and a range, and of the contrast parameter, must be less than one. Finding the contracting transform is equivalent to finding, for each couple of range $R$ and domain $D$, the parameters that best match the following expression:

$$R \approx c \cdot iso\big(res(D, s), iIso\big) + l, \quad (2)$$

where $res(D, s)$ is the result of scaling $D$ by a factor $s$, $iso(B, iIso)$ is the application of the isometry referred to by $iIso$ to the image block $B$, and $c$ and $l$ are the contrast and luminance parameters, which can be computed by a least squares method. To respect the square shape of the blocks, we limit the considered isometries to the eight possible rotations and symmetries ($iIso \in [1..8]$). In practice, the scale factor can
be fixed to $1/2$, as explained by Jacquin in [7]. We say that domain $D$ encodes range $R$ if the matching between $D$ and $R$, by $T = (D, R, iIso, s, c, l)$, is good enough, that is to say, if they can be considered similar according to some given criterion: PSNR, RMS, etc.
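As an illustration, the least-squares fit of the contrast and luminance parameters for one (range, domain) couple could look as follows. `apply_isometry` is the hypothetical helper from the previous sketch, and the clipping bound 0.99 is our own choice to keep the transform contracting.

```python
import numpy as np

def best_transform(range_block, domain_block):
    """Find (iIso, c, l) so that c * iso(res(D, 1/2), iIso) + l ~ R.

    `domain_block` has twice the side of `range_block`; it is first
    shrunk by the fixed 1/2 scale factor. Returns the best isometry
    index, contrast, luminance and the RMS error of equation (2).
    """
    b = range_block.shape[0]
    d = domain_block.astype(float).reshape(b, 2, b, 2).mean(axis=(1, 3))
    r = range_block.astype(float).ravel()
    best = None
    for iso in range(8):  # the 8 rotations / symmetries
        x = apply_isometry(d, iso).ravel()
        # least-squares fit of r ~ c * x + l
        A = np.stack([x, np.ones_like(x)], axis=1)
        (c, l), *_ = np.linalg.lstsq(A, r, rcond=None)
        c = float(np.clip(c, -0.99, 0.99))  # enforce |c| < 1
        l = float((r - c * x).mean())       # refit l for the clipped c
        err = np.sqrt(np.mean((c * x + l - r) ** 2))  # RMS criterion
        if best is None or err < best[3]:
            best = (iso, c, l, err)
    return best
```

$D$ then encodes $R$ whenever the returned RMS error is below the chosen similarity threshold.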
3 Learning Process
3.1 Domain Score Definition
The compression phase allows knowing which part of the image encodes which other one. To define the score of a domain, the main idea is to count how many ranges the domain may encode, according to some reconstruction error threshold $S$. The reconstruction error of a range $R$ by a domain $D$ is defined by

$$Er(D, R) = \min_{s, iIso, c, l} d\big(c \cdot iso(res(D, s), iIso) + l,\ R\big), \quad (3)$$

where $d(\cdot, \cdot)$ is the RMS measure between two blocks. Then we have

$$score(D, S) = \sum_{R} \delta_{Er(D, R) \le S}, \quad (4)$$

where $\delta_P = 1$ if $P$ is true and $0$ otherwise.
So we define a score to rate domains: the score of $D$ is the number of ranges $R$ that verify $Er(D, R) \le S$. We may notice that when the contrast is low, the matching between blocks cannot be considered representative. As natural images are coherent, the domain score map associated with the image varies almost continuously. We can therefore say that the pertinence of a domain is not located exclusively at the domain position but extends to its neighborhood. This phenomenon ensures the robustness of our method with respect to small variations in the choice of the partition of the initial image.
We have just seen how to rate domains. This allows us to establish an order relation
among the domains and gives us a choice criterion, so that the learning process leads
to a good modeling of the image or texture.
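Given a precomputed error matrix for equation (3), the score of equation (4) reduces to a row-wise count. A one-line sketch, where the (n_domains, n_ranges) matrix layout is our own assumption:

```python
import numpy as np

def domain_scores(Er, S):
    """score(D, S): number of ranges R with Er(D, R) <= S.

    `Er` is assumed stored as an (n_domains, n_ranges) matrix
    computed during the compression phase.
    """
    return (Er <= S).sum(axis=1)
```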
3.2 Selection of Representative Domains
We want to characterize textures through their most representative domains. A quite natural idea would be to select the domains with the highest scores. From the previous remark, however, two neighboring domains may have similar scores and would be too redundant to both be kept: they may encode parts of the same zone of the image. In fact, we want to encode the largest possible area of the image with the smallest number of domains. If we want a good reconstruction quality, we must store a huge number of domains; conversely, if we do not want our learning to be too specific, we must select few of them. So the selection of representative domains is made in two steps:

- computation of the threshold $S$ according to the needed coverage percentage;
- iterative selection of the best domains according to the reconstruction threshold $S$.

Thanks to the matrix $Er(\cdot, \cdot)$, we know the minimum threshold needed for a particular range to be encoded by at least one domain. We compute $S$ so that each element of a set of ranges, which represents the needed coverage percentage, can be encoded by at least one domain. Then we proceed as follows for the iterative selection of the domains:
1. All ranges are marked "not encoded".
2. The score of each domain is computed according to $S$, taking into account only the non-encoded ranges.
3. The domain with the best score is selected and stored, and all the ranges it encodes are marked "encoded".
4. Steps 2 and 3 are repeated until the needed coverage percentage is reached (a minimal sketch of the whole procedure is given after this list).
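The sketch below assumes the error matrix `Er` of equation (3) is stored as an (n_domains, n_ranges) array; the 30-domain budget mirrors the limit used in section 4.3, and the coverage default is a hypothetical value.

```python
import numpy as np

def select_domains(Er, coverage=0.9, max_domains=30):
    """Greedy selection of the most representative domains.

    Step 1: S is the smallest threshold that lets a `coverage`
    fraction of the ranges be encoded by at least one domain.
    Steps 2-3: repeatedly pick the domain that encodes the most
    not-yet-encoded ranges, until coverage or the budget is reached.
    """
    S = np.quantile(Er.min(axis=0), coverage)    # per-range best error
    encodes = Er <= S                            # domain d encodes range r
    encoded = np.zeros(Er.shape[1], dtype=bool)  # 1. all ranges not encoded
    selected = []
    while encoded.mean() < coverage and len(selected) < max_domains:
        scores = (encodes & ~encoded).sum(axis=1)  # 2. non-encoded ranges only
        d = int(scores.argmax())                   # 3. best-scoring domain
        if scores[d] == 0:
            break
        selected.append(d)
        encoded |= encodes[d]
    return selected, S
```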
3.3 Inter Class Discrimination
In order to achieve the segmentation of an image, several textures have to be learnt using the previous method. When learning a texture, some domains can be ambiguous in that they also allow encoding parts of other textures; such domains may be detrimental to the discrimination. To suppress this ambiguity, we try to reconstruct each texture (i.e. its ranges) with the domains coming from the other textures. A domain of a texture is considered ambiguous if it allows encoding more than a given percentage of the ranges of another texture. The ambiguous domains are removed from the learning database, as sketched below.
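In this sketch, the tolerated percentage and the matrix layout (errors of one texture's selected domains against the ranges of a competing texture) are illustrative assumptions.

```python
def remove_ambiguous(domains, cross_Er, S, max_fraction=0.1):
    """Remove domains that encode too much of another texture.

    `cross_Er[i]` holds the errors of the i-th selected domain against
    all ranges of the competing texture; a domain is ambiguous when it
    encodes more than `max_fraction` of them at threshold S.
    """
    return [d for i, d in enumerate(domains)
            if (cross_Er[i] <= S).mean() <= max_fraction]
```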
4 Recognition and Experimentation
4.1 Global Method
For each learnt texture, we have a set of representative domains. We want to use these blocks to label each pixel of a new image. To do so, we compare each square part of the image with each domain of each concept. This comparison relies on a distance between one domain of a concept and the block of the considered image (this distance is defined below). Thus, for a new image, we compute one distance map (with the same dimensions as the image) per texture or concept. To increase the reliability of our results, we include some neighborhood information: one can be more confident in an area reconstructed with many different domains of a concept than in an area reconstructed with very few of them. This parameter is called the richness of domains.
Let $c_i$ be a learned concept and $p$ a pixel of the test image; $rich(p, c_i)$ is the number of different domains of $c_i$ used to reconstruct the neighborhood of $p$. The higher the richness and the smaller the distance, the more confident we are in the labeling of the considered area.
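A possible sketch of this richness map follows. The square neighborhood size and the `best_domain` input (per block position, the index of the best-matching domain of the concept, or -1 when nothing matches) are our own illustrative assumptions; the paper does not fix the neighborhood definition.

```python
import numpy as np

def richness_map(best_domain, window=3):
    """rich(p, c_i): count of distinct domains of concept c_i used
    to reconstruct the neighborhood of each position."""
    h, w = best_domain.shape
    rich = np.zeros((h, w), dtype=int)
    k = window // 2
    for y in range(h):
        for x in range(w):
            win = best_domain[max(0, y - k):y + k + 1,
                              max(0, x - k):x + k + 1]
            rich[y, x] = len(np.unique(win[win >= 0]))  # distinct domains
    return rich
```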
4.2 Distance Definition for the Recognition Task
The principal originality of this study is the learning process. Once the reference block database has been computed for each concept, the local recognition task can be summarized as a classical search for the nearest element between the blocks of the test image and the domains of each concept. The local distance is computed by taking the normalized best output of a neural network that takes simple features as input (gradient orientation, histogram richness, fractal dimension, standard deviation). When using both the richness and the local distance, we must normalize them in order to compute a coherent value. Let $p$ be a particular pixel. For each class $c_i$ we know the normalized distance $dist(p, c_i)$ and the normalized richness $rich(p, c_i)$. The final distance of $p$ is defined as

$$d_{final}(p, c_i) = dist(p, c_i)^2 + \big(1 - rich(p, c_i)\big)^2. \quad (5)$$
And so the label of $p$ will be

$$label(p) = \arg\min_i \big(d_{final}(p, c_i)\big). \quad (6)$$
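Putting equations (5) and (6) together, the labeling reduces to a per-pixel argmin over class-wise maps. A sketch assuming both maps are already normalized to [0, 1]:

```python
import numpy as np

def label_pixels(dist_maps, rich_maps):
    """`dist_maps` and `rich_maps` are (n_classes, H, W) arrays of
    normalized distance and richness; returns the per-pixel label map."""
    d_final = dist_maps ** 2 + (1.0 - rich_maps) ** 2  # equation (5)
    return d_final.argmin(axis=0)                      # equation (6)
```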
4.3 Learning Data and Test Results
We have chosen to learn two classes: buildings and vegetation. To illustrate our method, we learn from four texture samples, two per class (see Fig. 1). Each learning sample contains between 4000 and 18000 domains. We store up to the 30 most representative domains for each class.
Fig. 1. Learning database samples. Areas 1 & 2 contain vegetation; areas 3 & 4 contain buildings.
Fig. 2. A test image and its ground truth: white for buildings, black for vegetation.
Fig. 3. Building recognition without and with the richness parameter.
Table 1. Confusion matrix without the richness parameter (ground truth in rows, %).

             Vegetation  Buildings
Vegetation         98.6        1.4
Buildings          45.4       54.6

Table 2. Confusion matrix with the richness parameter (ground truth in rows, %).

             Vegetation  Buildings
Vegetation         98.6        1.4
Buildings          41.8       58.2
These confusion matrices are computed pixel by pixel, which is not very favorable for evaluating small object segmentation. We may nevertheless notice that:

- 100% of the connected building regions have been detected;
- there is only one false alarm (at the bottom right of the image) for building recognition with the richness parameter;
- small isolated buildings are still detected when using the richness parameter.
5 Conclusion & Perspectives
We have presented an original idea for texture concept learning using the principle of fractal compression. This type of learning takes only a few parameters as input (the percentage of texture coverage and the maximum number of stored domains), which do not need any specialist's skills to be set. We then presented a simple recognition step and some test results. We highlighted the gain in reliability obtained by taking into account the richness of the class elements used to reconstruct an area. As these are among our first test results, the size of the test database must still be increased to qualify our approach. Concerning the recognition step, other solutions are available, for example using the normalized cross-correlation to precisely locate similar domains in a straightforward way.
References
1. Haralick, R.: Statistical and Structural Approaches to Texture, Proc. IEEE, Vol. 67, n°5, pp. 786-804 (1979).
2. Strzelecki, M.: Segmentation of Textured Biomedical Images Using Neural Networks, PhD Thesis (1995).
3. Pentland, A.: Fractal-Based Description of Natural Scenes, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 6, n°6, pp. 661-674 (1984).
4. Kaplan, L., Kuo, C.-C.: Texture Roughness Analysis and Synthesis via Extended Self-similar (ESS) Model, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 17, n°11, pp. 1043-1056 (1995).
5. Lepisto, L., Kunttu, I., Autio, J., Visa, A.: Classification Method for Colored Natural Textures Using Gabor Filtering, Proc. Image Analysis and Processing, pp. 397-401 (2003).
6. Ghazel, M., Freeman, G.H., Vrscay, E.R.: Fractal Image Denoising, IEEE Transactions on Image Processing, Vol. 12, n°12 (2003).
7. Jacquin, A.: Fractal Image Coding Based on a Theory of Iterated Contractive Image Transformations, Proc. SPIE Visual Comm. and Image Proc., pp. 227-239 (1990).
8. Jacquin, A.: Image Coding Based on a Fractal Theory of Iterated Contractive Image Transformations, IEEE Transactions on Image Processing, Vol. 1, pp. 18-30 (1992).
9. Fisher, Y.: Fractal Image Compression - Theory and Application, Springer-Verlag, New York (1994).
10. Maria, M.R.J., Benoît, S., Macq, B.: Speeding up Fractal Image Coding by Combined DCT and Kohonen Neural Net Method, Proc. ICASSP'98 - IEEE Intl Conference on Acoustics, Speech and Signal Processing, pp. 1085-1088 (1998).
11. Vincent, N., Seropian, A., Stamon, G.: Synthesis for Handwriting Analysis, Pattern Recognition Letters, n°26, pp. 267-275 (2004).
12. Chang, H.T.: Gradient Match and Side Match Fractal Vector Quantizers for Images, IEEE Transactions on Image Processing, Vol. 11, n°1 (2002).
13. Lai, Lam, Siu: A Fast Image Coding Based on Kick-out and Zero Contrast Conditions, IEEE Transactions on Image Processing, Vol. 12, n°11 (2003).
14. Hamzaoui, R.: Codebook Clustering by Self-Organizing Maps for Fractal Image Compression, Institut für Informatik, Report 75 (1996).
15. Wohlberg, B.E., De Jager, G.: Fast Image Domain Fractal Compression by DCT Domain Block Matching, Electronics Letters, Vol. 31, pp. 869-870 (1995).
16. Distasi, R., Nappi, M., Riccio, D.: A Range/Domain Approximation Error-Based Approach for Fractal Image Compression, IEEE Transactions on Image Processing, Vol. 15, n°1 (2006).
17. Barthel, K.U., Schüttemeyer, J., Voyé, T., Noll, P.: A New Image Coding Technique Unifying Fractal and Transform Coding, Proc. IEEE International Conference on Image Processing (1994).