Employing Wavelet Transforms to Support

Content-Based Retrieval of Medical Images

Carolina W. da Silva, Marcela X. Ribeiro, Agma J. M. Traina and Caetano Traina Jr.

Computer Science Department, University of S˜ao Paulo at S˜ao Carlos

Cx. Postal 668, CEP 13560-970 , S˜ao Carlos, SP, Brazil

Abstract. This paper addresses two important issues related to texture pattern

retrieval: feature extraction and similarity search. We use discrete wavelet trans-

forms to obtain the image representation from a multiresolution point of view.

Features of approximation subspaces compose the feature vectors, which suc-

cinctly represent the images in the execution of similarity queries. Wavelets and

multiresolution method are used to overcome the semantic gap that exists be-

tween low level features and the high level user interpretation of images. It also

deals with the “curse of dimensionality”, which involves problems with a similar-

ity deﬁnition in high-dimensional feature spaces. This work was evaluated with

two different image datasets and the results show an improvement of up to 90%

for recall values up to 65%, in the query results using the Daubechies wavelet

transform when comparing to other wavelets and gray level histograms.

1 Introduction

Content-based image retrieval (CBIR) is a technology that employs methods and algo-

rithms aiming at accessing pictures by referencing image patterns rather than alphanu-

merical indices. In order to allow a fast query answer, representative numerical features

that serve as image signatures are extracted from each image in the repository. Then,

the images are indexed using these precomputed signatures. In the query execution, the

signature extracted from the query example is compared to the signature precomputed

from all images in the database [1].

Techniques for content-based access into medical image repositories are a subject

of high interest in recent research, and remarkable efforts have been reported so far. In

particular, CBIR for picture archiving and communication systems (PACS) can make a

signiﬁcant positive impact in health informatics and health care. However, in spite of

the reports of innovations, the practical use of CBIR in PACS has not been established

yet. The reasons are manifold, and they are identiﬁed only informally, without an ob-

jective measure for evaluating the CBIR systems and identifying the shortcomings (or

gaps) in the methods. In general, two gaps have been identiﬁed in CBIR techniques:

(i) the semantic gap between the low-level features (color, texture and shape) that are

automatically extracted by machine and the high-level concepts of human vision and

⋆

We are thankful to FAPESP, CNPq and CAPES.

W. da Silva C., X. Ribeiro M., J. M. Traina A. and Traina Jr. C. (2008).

Employing Wavelet Transforms to Support Content-Based Retrieval of Medical Images.

In Proceedings of the 8th International Workshop on Pattern Recognition in Information Systems, pages 19-28

Copyright

c

SciTePress

image understanding; and (ii) the sensory gap between the object in the world and the

information in a (computational) description derived from a recording of that scene [1].

Basically, all systems use the assumption of equivalence of an image and its rep-

resentation in the feature space. These systems often use measurement systems, such

as the easily understandable Euclidean vector space model for measuring distances be-

tween a query image (represented by its features) and possible results, representing

all images as feature vectors in an n-dimensional vector space. Nevertheless, metrics

have been shown to not correspond well to the human visual perception. Several other

distance measures do exist for the vector space model such as the city-block distance,

the Mahalanobis distance or a simple histogram intersection. Still, the use of high-

dimensional feature spaces has shown to cause problems. Also, caution should be taken

when choosing the distance measure in order to retrieve meaningful results. These prob-

lems with a similarity deﬁnition in high-dimensional feature spaces is also known as the

“curse of dimensionality”, and has also been discussed in the domain of medical imag-

ing [2].

Beyer et. al. proved in [3] that the increasing in the number of features (and con-

sequently in the dimensionality of the data) leads to losing the signiﬁcance of each

feature value. Thus, to avoid decreasing the discrimination accuracy, it is important to

keep the number of features as low as possible, establishing a trade-off between the

discrimination power and the feature vector size.

Aimed at overriding the problems of the semantic gap and the “curse of dimension-

ality”, this paper shows a simple but powerful feature extractor based on multiresolution

wavelet transforms, which uses the approximation subspace to compose the feature vec-

tor to represent the image. The results of applying our method achieves 90% regarding

the precision in the retrieval of medical images that asks up to 65% of the image set.

2 Background - Wavelets

Our proposed technique works on image subspaces generated by applying wavelet

transforms through the multiresolution method. Wavelets are mathematical functions

that separate the signal in different components of frequency, and then examine each

component with a combined resolution with its scale.

It is interesting to compare the wavelet transform to the Fourier transform. While the

Fourier transform analyzes a signal according to the frequency, the wavelet transform

analyzes it according to the scale. Thus, the wavelets can remove statistical redundancy

among pixels, providing a more compact representation of the image information. It is

believed that image indexing generated over the wavelet transformed domain are more

efﬁcient than those designed over the spatial domain. This is due to the fact that the

transformed coefﬁcients have better deﬁned distributions than image pixels. Besides,

the wavelets have a multiresolution property that make it easier to extract the image

features from transformed coefﬁcients [4].

The central element of a multiresolution analysis is a function φ(t), called the scal-

ing function, whose role is to represent a signal at different scales. The translations of

the scaling function constitute the “building blocks” of the representation of a signal at

a given scale. The scale can be increased by dilating (stretching) the scaling function or

decreased by contracting it.

The scaling function φ(t) acts as a sampling function (a basis), in the sense that

the inner product of φ(t) with a signal represents a sort of average value of the signal

over the support (extent) of φ. A recursive application of this process generates new

nested spaces V

j

, that is, ...V

−2

⊂ V

−1

⊂ V

−0

⊂ V

1

⊂ ..., which are the basis of the

multiresolution analysis.

By deﬁnition, a signal in V

−1

can be expressed as a superposition of translations of

the function φ

1

, but since the space V

0

is included in V

−1

, any function in V

0

can also

be expanded in terms of the translations of φ(t). In particular, this is true for the scaling

function itself.

Consequently, there must exist a sequence of numbers h = h

0

, h

1

, . . . such that the

following relationship is satisﬁed :

φ

0

(t) =

X

n

h

n

φ

−1

(t −

n

2

) (1)

Equation 1 is very important and it is known as the scaling equation. Equation 1

describes how the scaling function can be generated by superposing compressed copies

of itself. Now it is possible to deﬁne a new space W

j

as the orthogonal complement of

V

j

in V

j+1

. In other words, W

j

is the space of all functions in W

j

that are orthogonal

to all functions in V

j

under the chosen inner product. The relationship to wavelets is in

the fact that the spaces W

m

are spanned by dilation and translation of a function ψ(t),

thus, such collection of basis functions are called wavelets.

As in the case with the scaling function, since the wavelet ψ(t) belongs to V

−1

, it

can be expressed as a linear combination of φ(t) at scale m = −1, which can be written

as:

ψ(t) =

X

n

g

n

φ

−1

(t − n) (2)

where the sequence g is called the wavelet sequence. In the literature, h and g are known

as the low and high frequency ﬁlters respectively.

Different wavelet bases are obtained by varying the support width of the wavelet. In

general, changes in the wavelet support affect the ﬁnal frequency characteristics of the

wavelet transform. Usually the amplitudes of the coefﬁcients change and, consequently,

the scale, where the signal and noise separate, also changes. The choice of a wavelet

basis still represents an open problem for ﬁltering.

Probably the most popular wavelets are the Daubechies wavelets, because of their

orthogonality and compact support [5]. We choose Symlets, Coifman and Daubechies

wavelets to explore in this work.

3 Proposed Method

Our method deals with two inherent drawbacks of a CBIR system, the high dimension-

ality of feature vectors and the semantic gap. We amend the ﬁrst one by applying higher

resolution on the multiresolution technique, and the second one by characterizing im-

ages through the feature vectors composed of the approximation subspace, which are

obtained through a convolution over each image by the wavelet ﬁlters. We choose the

following wavelet ﬁlters: Coifman (coif 1 and coif2), Symlet (sym2, sym3, sym4,

sym5 and sym15) and Daubechies (db1, db2, db3, db4 and db8)

1

.

Figure 1 graphically summarizes the proposed method. We use 4, 5 and 6 levels

of resolution and the approximation subspace is represented by reading it column to

column, putting the values obtained on the feature vector. Thus, the dimension of the

feature vector is given by multiplying the dimension of the approximation subspace.

That is, to calculate the dimension of the feature vector we just divide by two the width

and the height of the image from each resolution level applied, as is shown in Figure 2,

and multiplying the dimensions of the approximation subspace. The Equation 3 gives

the formula to calculate the number of elements of the proposed feature vector.

#features =

width

2

N

∗

height

2

N

, (3)

where width is the image width, height is the image height and N is the level of decom-

position. Figure 2 shows an example of a wavelet decomposition and the conﬁguration

of regions after decomposition.

Fig.1. Proposed method of feature extraction using 4 levels of decomposition. Each pixel value

of the approximation subspace is put in a feature vector.

For instance, if an image has 256 × 256 pixels, when it is applied 4 levels of reso-

lution, the feauture vector has 256 features, when it uses 5 levels, the feature vector has

64 features. And when it uses 6 levels, the feature vector has 16 features, which is the

total of pixels from the approximation subspace.

4 Experiments and Results

Using the proposed method, we developed a prototype to process k-nearest neighbor

queries (k-NN), answering queries such as: “retrieve the ten most similar images of the

1

These respectively wavelet ﬁlters can be found on the Matlab6.5 tool

(a) (b) (c)

Fig.2. Example of wavelet decomposition (a) original image; (b) image decomposed in two steps

from Haar wavelets; (c) conﬁguration of regions after decomposition.

image MR Head of John Doe”. Figure 3 shows an example of a k-NN query performed

by our prototype. The similarity between two images is expressed by the distance be-

tween their respective feature vectors. We use the well-known Euclidean distance func-

tion (L

2

) to compare the feature vectors.

Query image Thumbnails of the 10 nearest neighbors

2439.jpg

Fig.3. Example of a 10-nearest neighbor query performed by the developed prototype over an

image database of 704 images.

In order to evaluate the effectiveness of the proposed technique we worked on a

variety of medical images categories, and the Precision and Recall (PR) graph [6] was

used as an efﬁcacy measure, since it has been broadly employed to express the re-

trieval efﬁciency of a method. Recall indicates the proportion of relevant images in the

database that has been retrieved when answering a query, and Precision is the portion

of the retrieved images that are relevant for the query. As a rule of thumb, the closer the

PR curve to the top of the graph, the better the technique is. For our experiments, each

PR curve represents the average curve of all the curves obtained by performing a k-NN

query for each image in the whole images set.

We have used the Slim-tree [7] as the indexing structure for the prototype, which is

a metric access method (MAM) specially developed to minimize disk accesses, making

the whole system faster.

4.1 Experiment 1 - The 210 Images Dataset

This dataset consists of 210 medical images classiﬁed in seven categories: Angiogram,

MR (Magnetic Resonance) Axial Pelvis, MR Axial Head, MR Coronal Abdomen, MR

Coronal Head, MR Sagittal Head and MR Sagittal Spine. Each category is represented

by 30 images.

First, we compare the method by using 4 levels of resolution and Coifman (coif 1

and coif 2) , Daubechies (db2 and db8) and Symlet (sym2, sym3, sym4, sym5 and

sym15) wavelets. The use of each one of these wavelet transforms generate a vector

with 256 features. The PR curves of the nine proposed vectors are shown in Figure 4.

Each point of the graph is obtained by the average of 210 queries.

0.75

0.8

0.85

0.9

0.95

1

s

ion

PrecisionXRecall

coif1

coif2

db2

db2

coif1

0.45

0.5

0.55

0.6

0.65

0.7

0 0.2 0.4 0.6 0.8 1

Preci

s

Recall

db8

sym2

sym3

sym4

sym5

sym15

Fig.4. PR curves showing the retrieval behavior of the proposed method using the 4

th

level of

resolution, with 256 features.

Analyzing the graphs of Figure 4, we can see that the wavelet that best represents

the images is the Daubechies db2. We can also consider that the curve generated by

coif1 wavelet practically ties to db2.

For the graphics in Figure 5, we compare the Coifman (coif 1 and coif 2) and

Daubechies (db1, db2 and db8 ) wavelet transforms in the 5

th

level of resolution. Thus,

their feature vectors also have 64 features. Figure 5 shows the PR curves from the

queries on the dataset represented by these feature vectors.

Note that the Haar (or db1) wavelet is the best one to represent these images. As

a basis for comparison, the graphs in Figure 6 also presents the average PR curve ob-

tained by using gray-level histograms over the same image dataset. The results for Haar

(or db1) give the best PR curves shown until now. We can see that all the proposed

methods have better PR curves than Histogram. The curve with better precision is the

one generated by db1, with just 64 features, while the other methods use 256 features,

i.e., there is a reduction of 75% on the data dimensionality. The queries performed by

using the db1 feature vectors taken from the 5

th

level of resolution give precision rates

up to 82,55% regarding the images’ histogram, to queries that ask until 90% of the im-

ages. These methods are well-suited to represent the images under evaluation, since the

precision values are over than 80% for all recall values less than 65%.

0.7

0.8

0.9

1

cision

PrecisionXRecall

db1

0.4

0.5

0.6

0 0.2 0.4 0.6 0.8 1

Pre

Recall

coif1 coif2 db1 db2 db8

Fig.5. PR curves showing the retrieval behav-

ior of the proposed method using the 5

th

level

of resolution, with 64 features.

0.5

0.6

0.7

0.8

0.9

1

e

cision

PrecisionXRecall

db1-5n

0

0.1

0.2

0.3

0.4

0 0.2 0.4 0.6 0.8 1

Pr

e

Recall

coif1Ͳ4n db2Ͳ4n db1Ͳ5n histogram

Fig.6. PR curves showing the retrieval be-

havior of the best curves and gray-level his-

tograms.

Thus, we can conclude that the dimensionality curse really damages the results,

because the irrelevant features disturb the inﬂuence of the relevant ones. Moreover, the

application of wavelet transform in 5 levels through the multiresolution method reduced

the redundancy of information from data, and it also well represents the images for

executing similarity queries.

4.2 Experiment 2 - The 704 Image Dataset

A larger image dataset, with 704 MR images, which is classiﬁed in eight categories was

used herein. The number of the images in the dataset regarding each category is: An-

giogram (36), MR Axial Pelvis (86), MR Axial Head (155), MR Sagittal Head (258),

MR Coronal Abdomen (23), MR Sagittal Spine (59), MR Axial Abdomen (51) and MR

Coronal Head (36). In the previous experiment the best result was obtained applying

a Daubechies wavelet transform, and, according to Wang [8], the Daubechies wavelet

achieves excellent results in image processing due to its properties. Thus, we use sev-

eral wavelets of the family of Daubechies on 4, 5 and 6 levels of resolution by the

multiresolution method.

Figure 7 shows the Precision vs. Recall generated by the proposed method. First,

it is displayed the wavelet name, then the level of resolution and ﬁnally the number

of elements of the feature vector. Observe that the PR curves generated by the same

wavelet transform in several levels of resolution decrease according to the wavelet cho-

sen. The bigger the number of the ﬁlters, the faster the curves decrease when a higher

level of resolution is chosen. Analyzing the db1 wavelet, which has two ﬁlters, observe

that the curves generated in 4 and 5 levels of resolution are equivalent, even exiting a

large difference between the number of features from their vectors. For a 6 level of res-

olution we still have an excellent result (see db1-6n-16 curve), as with just 16 features,

the precision is over than 80% for values of recall until 90%. And comparing the 256

features with the 16 ones, we have a dimensionality reduction of 93.75%. Also note that

the larger the number of ﬁlters, the smaller the precision of the queries, considering the

same level of resolution.

0 95

1

PrecisionXRecall

db1-4n

0 85

0.9

0

.

95

db1Ͳ4n

db1Ͳ5n

db1

6

0.75

0.8

0

.

85

ion

db1

Ͳ

6

n

db2Ͳ4n

db2

5n

0.65

0.7

0.75

Precis

db2

Ͳ

5n

db2Ͳ6n

db3Ͳ4n

0.55

0.6

db3Ͳ5n

db4Ͳ4n

0.45

0.5

db4Ͳ5n

db8Ͳ4n

0 0.2 0.4 0.6 0.8 1

Recall

db8Ͳ5n

Fig.7. PR curves generated by several Daubechies wavelets transforms using the 4

th

, 5

t

h and

6th level of multiresolution.

Figure 8 shows the best curves of PR with 256, 64 and 16 features, respectively,

from the Figure 7 and compare them with the curve given by the gray-level histogram.

Visually, all three methods have a better performance than the histogram. Numerically,

we get an improvement of precision until 531% to values of recall until 95%, for the

feature vector with 256 features. For 64 features, the improvement in precision is up

to 528% to a recall of 95%; and for 16 features, the improvement in precision is up to

491% to a recall of 90%.

0.5

0.6

0.7

0.8

0.9

1

e

cision

PrecisionXRecall

0

0.1

0.2

0.3

0.4

0 0.2 0.4 0.6 0.8 1

Pr

e

Recall

db1Ͳ4n db1Ͳ5n db1Ͳ6n histogram

Fig.8. PR curves showing the retrieval behavior of the best curves and gray-level histogram.

To compare this method with another one from literature, we used a technique pro-

posed by Balan [9], which employs an improved version of the EM/MPM method to

segment images, and for each region segmented based on texture, six features were ex-

tracted: the mass (m); the centroid (xo and yo); the average gray level (µ), the Fractal

dimension (D); and the linear coefﬁcient used to estimate D (b). Therefore, when an

image is segmented in L classes, the feature vector has L * 6 elements. Here we use

L = 5, so the feature vector has 30 features. Figure 9 illustrates the feature vector

described.

D

1

x0

1

y0

1

m

1

µ

1

b

1

... D

L

x0

L

y0

L

m

L

µ

L

b

L

Featuresofthetextureclass1 FeaturesofthetextureclassL

Fig.9. The feature vector.

Figure 10 shows the comparison of the curves generated by our method with 16

features (db1-6n-16 curve) to the method that uses an improved version of EM/MPM

algorithm, which is one of the best methods in the literature. We can see that our method

performs better when processing similarity queries (k-NN). Note also that our method

demands fewer features than EM/MPM. We can also compare the time spending to im-

age processing and extraction of the features. While the improved version of EM/MPM

spends around 17.05 seconds per image, our method with 16 features spends around

0.77 seconds.

0

0.6

0.7

0.8

0.9

1

s

ion

PrecisionXRecall

db1-6n

0

0.1

0.2

0.3

0.4

0

.5

0 0.2 0.4 0.6 0.8 1

Preci

s

Recall

30features

16features

Fig.10. PR curves from db1-6n with 16 features and from improved version of the EM/MPM.

5 Conclusions

In this paper we presented a new technique based on wavelet-approximation subspaces,

which was used to compose the image feature vector to process similarity queries on

the image content. A tool based on the presented technique was implemented, aimed at

validating the technique proposed on real images from different tissues of the human

body, and to assist the study and analysis of medical images. Thus, the method can be

included in a PACS under development in our institution.

Several wavelets were evaluated, and the Daubechies showed a better efﬁcacy than

the other ones for the analyzed image sets. The achieved results showed that the pro-

posed method performs very well, presenting an image retrieval accuracy always over

90% for recall values smaller than 65%. Moreover, we obtained a feature vector with

just 16 elements that provided a better performance than the feature vector with 30 fea-

tures obtained from segmented images by using an improvement version of EM/MPM

algorithm, which is a much more time consuming method. By the results obtained in

the work, we can claim that wavelets and the multiresolution method are well suited to

deal with the issue of the semantic gap and the dimensionality of feature vectors.

References

1. Deserno, T.M., Antani, S., Long, L.R.: Gaps in content-based image retrieval. In: Medical

Imaging 2007: PACS and Imaging Informatics. Volume 6516 of SPIE. (2007)

2. M¨uller, H., Michoux, N., Bandon, D., Geissbuhler, A.: A review of content-based image

retrieval systems in medical applications-clinical beneﬁts and future directions. International

Journal of Medical Informatics 73 (2004) 1–23

3. Beyer, K., Godstein, J., Ramakrishnan, R., Shaft, U.: When is “nearest neighbor” meaningful?

In: ICDT. Volume 1540 of Lecture Notes in Computer Science. (1999) 217–235

4. Stollnitz, E.J., DeRose, T.D., Salesin, D.H.: Wavelets for Computer Graphics - Theory and

Applications. (1996)

5. Daubechies, I.: Ten Lectures on Wavelets. Volume 61. (1992)

6. Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval. (1999)

7. Traina Jr., C., Traina, A.J.M., Seeger, B., Faloutsos, C.: Slim-trees: High performance metric

trees miniminzing overlap between nodes. Research Paper CMU-CS-99-170 (1999)

8. Wang, J.Z.: Wavelets and imaging informatics: A review of the literature. Journal of Biomed-

ical Informatics 34 (2001) 129–141

9. Balan, A.G.R., Traina, A.J.M., Traina Jr., C., Marques, P.M.d.A.: Fractal analysis of image

textures for indexing and retrieval by content. In: 18th IEEE CBMS. (2005) 581–586