A NOVEL RELEVANCE FEEDBACK PROCEDURE BASED ON LOGISTIC REGRESSION AND OWA OPERATOR FOR CONTENT-BASED IMAGE RETRIEVAL SYSTEM

P. Zuccarello, E. de Ves

Departamento de Informática, University of Valencia, Avda. Vicente Andrés Estellés, 1. 46100-Burjasot, Valencia, Spain

T. Leon, G. Ayala

Department of Statistics and Operations Research, University of Valencia, Valencia, Spain

J. Domingo

Institute of Robotics, University of Valencia, Valencia, Spain

Keywords:

Visual information retrieval, relevance feedback, logistic regression.

Abstract:

This paper presents a new algorithm for content-based retrieval systems in large databases. The objective of these systems is to find the images which are as similar as possible to a user query from those contained in the global image database, without using textual annotations attached to the images. The procedure proposed here to address this problem is based on a logistic regression model: the algorithm considers the probability that an image belongs to the set of those desired by the user. In this work a relevance probability π(I) is a quantity which reflects the estimate of the relevance of the image I with respect to the user's preferences. The problem of the small sample size with respect to the number of features is solved by fitting several partial linear models and combining their relevance probabilities by means of an ordered weighted averaging (OWA) operator. Experimental results are shown to evaluate the method on a large image database in terms of the average number of iterations needed to find a target image.

1 INTRODUCTION

The increasing amount of information available in today's world raises the need to retrieve relevant data

efﬁciently. Unlike text-based retrieval, where key

words are successfully used to index documents,

content-based image retrieval poses up-front the fun-

damental questions of how to extract useful image

features and how to use them for intuitive retrieval

(Smeulders et al., 2000). The main drawback of tex-

tual image retrieval systems, that is, the annotator de-

pendency, would be overcome in pure CBIR systems.

Image features are a key aspect of any CBIR sys-

tem. A general classiﬁcation can be made: low level

features (color, texture and shape) and high level fea-

tures (usually obtained by combining low level fea-

tures in a reasonably predeﬁned model). High level

features have a strong dependency on the application

domain, therefore they are not usually suitable for

general purpose systems. This is the reason why one

of the most important and developed research activi-

ties in this ﬁeld has been the extraction of good low

level image descriptors. Obviously, there is an impor-

tant gap between these features and human perception

(a semantic gap). For this reason, different methods

(mostly iterative procedures) have been proposed to

deal with the semantic gap (Rui et al., 1998). In most

cases the idea underlying these methods is to integrate

the information provided by the user into the decision

process. This way, the user is in charge of guiding

the search by indicating his/her preferences, desires

and requirements to the system. The basic idea is

rather simple: the system displays a set of images

(resulting from a previous search); the user selects

the images that are relevant (desired images) and re-

jects those which are not (images to avoid) according

to his/her particular criterion; the system then learns

from these training examples to achieve an improved

performance in the next run. The process goes on iteratively until the user is satisfied. Procedures of this kind are called relevance feedback algorithms (Zhou and Huang, 2003; de Ves et al., 2006).

Zuccarello P., de Ves E., Leon T., Ayala G. and Domingo J. (2007). A NOVEL RELEVANCE FEEDBACK PROCEDURE BASED ON LOGISTIC REGRESSION AND OWA OPERATOR FOR CONTENT-BASED IMAGE RETRIEVAL SYSTEM. In Proceedings of the Second International Conference on Computer Vision Theory and Applications - IU/MTSV, pages 167-172. Copyright © SciTePress

A query can be seen as an expression of an information need to be satisfied. Any CBIR system aims at finding images relevant to a query and thus to the

information need expressed by the query. The rela-

tionship between any image in the database and a par-

ticular query can be expressed by a relevance value.

This relevance value relies on the user-perceived sat-

isfaction of his/her information need. The relevance

value can be interpreted as a mathematical probabil-

ity (a relevance probability). The notion of relevance

probability is not unique because different interpre-

tations have been given by different authors. In this

paper a relevance probability π(I) is a quantity which

reﬂects the estimation of the relevance of the image I

with respect to the user’s information needs. Initially,

every image in the database is equally likely, but as

more information on the user’s preferences becomes

available, the probability measure concentrates on a

subset of the database. The iterative relevance feed-

back scheme proposed in the present paper is based

on logistic regression analysis for ranking a set of im-

ages in decreasing order of their evaluated relevance

probabilities.

Logistic regression is based on the construction

of a linear model whose inputs, in our case, will be

the image characteristics extracted from a certain im-

age I and whose output is a function of the relevance

probability of the image in the query π(I). In logis-

tic regression analysis, one of the key features to be

established is the order of the model to be adjusted.

The order of the model must be in accordance with

the reasonable amount of feedback images requested

from the user. For example, it is not reasonable for

the user to select 40 images in each iteration; a feedback of 5 to 10 images would be acceptable. This requirement leads us to group the image features into n

smaller subsets. The outcome of this strategy is that

n smaller regression models must be adjusted: each

sub-model will produce a different relevance probability $\pi_k(I)$ ($k = 1, \ldots, n$). We then face the question of how to combine the $\pi_k(I)$ in order to rank the database according to the user's preferences. OWA (ordered weighted averaging) operators, which were introduced by Yager in 1988 (Yager, 1988), provide a consistent and versatile way of aggregating multiple inputs into one single output.

Section 2 explains the logistic regression approach

to the problem. Next, in section 3 the aggregation op-

erators used in our work are introduced. Section 4

describes the low level features extracted from the images and used to retrieve them. A crucial part of this

work, the proposed algorithm, is described in detail in

section 5. After that, in section 6 we present experi-

mental results which evaluate the performance of our

technique using real-world data. Finally, in section 7

we extract conclusions and point to further work.

2 LOGISTIC REGRESSION MODEL

At each iteration, a sample is evaluated by the user

selecting two sets of images: the examples or posi-

tive images and the counter-examples or negative im-

ages. Let us consider the (random) variable Y giving

the user evaluation where Y = 1 means that the image

is positively evaluated and Y = 0 means a negative

evaluation.

Each image in the database has been previously

described by using low level features in such a way

that the j-th image has the k-dimensional feature vector $x_j$ associated. Our data will consist of $(x_j, y_j)$, with $j = 1, \ldots, k$, where $x_j$ is the feature vector and $y_j$ the user evaluation (1 = positive and 0 = negative). The

image feature vector x is known for any image and

we intend to predict the associated value of Y . The

natural framework for this problem is the generalized

linear model. In this paper, we have used a logistic

regression where P(Y = 1 | x) i.e. the probability that

Y = 1 (the user evaluates the image positively) given

the feature vector x, is related to the systematic part

of the model (a linear combination of the feature vec-

tor) by means of the logit function. Generalized lin-

ear models (GLMs) extend ordinary regression mod-

els to encompass non-normal response distributions

and modeling functions of the mean. Most statisti-

cal software has the facility to ﬁt GLMs. Logistic

regression is the most important model for categor-

ical response data. Logistic regression models are

also called logit models. They have been successfully

used in many different areas including business appli-

cations and genetics. For a binary response variable

Y and p explanatory variables $X_1, \ldots, X_p$, the model for $\pi(x) = P(Y = 1 \mid x)$ at values $x = (x_1, \ldots, x_p)$ of the predictors is

$$\operatorname{logit}[\pi(x)] = \alpha + \beta_1 x_1 + \ldots + \beta_p x_p \qquad (1)$$

where $\operatorname{logit}[\pi(x)] = \ln \frac{\pi(x)}{1 - \pi(x)}$. The model can also be stated by directly specifying $\pi(x)$ as

$$\pi(x) = \frac{\exp(\alpha + \beta_1 x_1 + \ldots + \beta_p x_p)}{1 + \exp(\alpha + \beta_1 x_1 + \ldots + \beta_p x_p)}. \qquad (2)$$

The parameter $\beta_i$ refers to the effect of $x_i$ on the log odds that $Y = 1$, controlling for the other $x_j$. The model parameters are obtained by maximum likelihood, that is, by solving the likelihood equations.
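As a minimal sketch of equations (1) and (2), the following Python code evaluates π(x) and estimates α and β by plain gradient ascent on the log-likelihood. The function names, the learning rate, the number of steps and the toy data are illustrative assumptions; the paper does not state which fitting routine was used.

```python
import math

def relevance_probability(alpha, beta, x):
    """Equation (2): pi(x) = exp(eta) / (1 + exp(eta)), eta = alpha + sum(beta_i * x_i)."""
    eta = alpha + sum(b * xi for b, xi in zip(beta, x))
    # Numerically stable evaluation of the logistic function.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

def logit(p):
    """Left-hand side of equation (1): ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """Gradient ascent on the log-likelihood (lr and steps are assumed settings).

    With few, perfectly separable samples the estimates keep growing,
    so the fixed number of steps also keeps them finite.
    """
    p = len(xs[0])
    alpha, beta = 0.0, [0.0] * p
    for _ in range(steps):
        grad_a, grad_b = 0.0, [0.0] * p
        for x, y in zip(xs, ys):
            err = y - relevance_probability(alpha, beta, x)
            grad_a += err
            for i in range(p):
                grad_b[i] += err * x[i]
        alpha += lr * grad_a
        beta = [b + lr * g for b, g in zip(beta, grad_b)]
    return alpha, beta

# Toy feedback: two features, three positive (1) and three negative (0) images.
xs = [[0.9, 0.2], [0.8, 0.4], [0.7, 0.1], [0.2, 0.9], [0.1, 0.6], [0.3, 0.8]]
ys = [1, 1, 1, 0, 0, 0]
alpha, beta = fit_logistic(xs, ys)
```

Calling `relevance_probability(alpha, beta, x)` on any feature vector then gives the estimated probability that the user would evaluate that image positively.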

In the ﬁrst steps of the procedure, we have a major

difﬁculty when having to adjust a global regression

model in which we take the whole set of variables into

account, because the number of images (the number

of positive plus negative images chosen by the user)

VISAPP 2007 - International Conference on Computer Vision Theory and Applications


is typically smaller than the number of characteristics. In this case, the adjusted regression model has as many parameters as the number of data points, and many relevant variables might not be considered. On the

other hand it is not realistic to ask the user to make a

great number of positive and negative selections from

the very beginning; therefore we think that the dif-

ﬁculty cannot be avoided in this way. In order to

solve this problem, our proposal is to adjust different

smaller regression models: each model considers only

a subset of variables consisting of semantically re-

lated characteristics of the image. Consequently, each

sub-model will associate a different relevance prob-

ability to a given image x, and we face the question

of how to combine them in order to rank the database

according to the user’s preferences. We can see this

question as an information fusion problem.
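The idea of fitting one small model per group of related features can be sketched as follows. The grouping of feature indices, the helper names and the gradient-ascent settings are hypothetical; the paper's actual feature groups are described in its features section.

```python
import math

def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def fit_submodel(xs, ys, lr=0.1, steps=500):
    """Fit alpha, beta for one feature subset by gradient ascent (assumed settings)."""
    p = len(xs[0])
    alpha, beta = 0.0, [0.0] * p
    for _ in range(steps):
        errs = [y - sigmoid(alpha + sum(b * v for b, v in zip(beta, x)))
                for x, y in zip(xs, ys)]
        alpha += lr * sum(errs)
        beta = [b + lr * sum(e * x[i] for e, x in zip(errs, xs))
                for i, b in enumerate(beta)]
    return alpha, beta

def partial_probabilities(groups, xs, ys, query):
    """Return one relevance probability pi_k(query) per feature group."""
    probs = []
    for idx in groups:
        # Keep only the columns belonging to this group.
        sub = [[x[i] for i in idx] for x in xs]
        alpha, beta = fit_submodel(sub, ys)
        eta = alpha + sum(b * query[i] for b, i in zip(beta, idx))
        probs.append(sigmoid(eta))
    return probs

# Hypothetical grouping: features 0-1 form one group, features 2-3 another.
groups = [[0, 1], [2, 3]]
xs = [[0.9, 0.1, 0.5, 0.4], [0.8, 0.2, 0.4, 0.6],
      [0.1, 0.9, 0.6, 0.5], [0.2, 0.7, 0.5, 0.3]]
ys = [1, 1, 0, 0]
probs = partial_probabilities(groups, xs, ys, [0.85, 0.15, 0.5, 0.5])
```

Each sub-model has only as many coefficients as its group has features, so a handful of user selections is enough to fit it; the price is that the resulting probabilities must be combined afterwards.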

3 AGGREGATING THE RELEVANCE PROBABILITIES

Let us denote as $\pi_1(x), \pi_2(x), \ldots, \pi_n(x)$ the different relevance probabilities associated with a given image x. Each one of them has been obtained separately by using a different regression model, and we need to associate a final probability π(x) by aggregating the information provided by each $\pi_j(x)$ ($j = 1, \ldots, n$). Mathematical aggregation operators transform a finite number of inputs into a single output and play an important role in image retrieval. In (Stejic et al., 2005) the authors compare the effect of 67 operators applied to the problem of computing the overall image similarity, given a collection of individual feature similarities. Their results show how important the choice of the aggregation operator is for retrieval performance. We have not used any of the 67 operators reviewed. Instead, we decided to use the so-called ordered weighted averaging (OWA) operators (Yager, 1988), which have since been successfully applied in different areas such as decision making, expert systems, neural networks, fuzzy systems and control. An OWA operator of dimension n is a mapping $f : \mathbb{R}^n \to \mathbb{R}$ with an associated weighting vector $W = (w_1, \ldots, w_n)$ such that $\sum_{j=1}^{n} w_j = 1$ and

$$f(a_1, \ldots, a_n) = \sum_{j=1}^{n} w_j b_j,$$

where $b_j$ is the j-th largest element of the collection of aggregated objects $a_1, \ldots, a_n$. The particular cases shown in Table 1 illustrate the idea underlying OWA operators.

Notice that no weight is associated with any particular input; instead, the relative magnitude of the input decides which weight corresponds to each input. In our application, the inputs are relevance probabilities and this property is very interesting because we do not know, a priori, which set of visual descriptors will provide us with the best information.

Table 1: Illustrative examples of OWA aggregation values.

W                             $f(a_1, \ldots, a_n)$
$(1, 0, \ldots, 0)$           $\max_i a_i$
$(0, 0, \ldots, 1)$           $\min_i a_i$
$(1/n, 1/n, \ldots, 1/n)$     $\frac{1}{n} \sum_{i=1}^{n} a_i$
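The definition of the OWA operator and the three particular cases of Table 1 can be reproduced with a few lines of Python. This is only a sketch: the function name `owa` and the sample probabilities are assumptions made for illustration.

```python
def owa(weights, values):
    """OWA aggregation: sum_j w_j * b_j, with b_j the j-th largest value."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # The weights attach to positions in the sorted list, not to particular inputs.
    ordered = sorted(values, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))

probs = [0.2, 0.9, 0.5]                      # hypothetical partial relevance probabilities
owa_max = owa([1, 0, 0], probs)              # picks the largest input
owa_min = owa([0, 0, 1], probs)              # picks the smallest input
owa_mean = owa([1 / 3, 1 / 3, 1 / 3], probs) # arithmetic mean
```

Because the inputs are sorted before weighting, permuting `probs` leaves every result unchanged, which is the property the text relies on when no feature group is known in advance to be the most informative.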

As OWA operators are bounded by the max and

min operators, Yager introduced a measure called or-

ness to characterize the degree to which the aggrega-

tion is like an or (max) operation:

$$\operatorname{orness}(W) = \frac{1}{n-1} \sum_{i=1}^{n} (n - i)\, w_i. \qquad (3)$$

This author also introduced the concept of dispersion or entropy associated with a weighting vector:

$$\operatorname{Disp}(W) = -\sum_{i=1}^{n} w_i \ln w_i. \qquad (4)$$

Disp(W) tries to reflect how much of the information in the arguments is used during an aggregation based on W.

Clearly, the vector of weights W can be pre-fixed, but a number of approaches have also been suggested for determining it according to different criteria. One of the first methods developed was proposed by O'Hagan (O'Hagan, 1988). It provides us with the vector of weights which, for a given level of orness (optimism) α, maximizes their entropy:

$$W = \arg\max_{w} \; -\sum_{i=1}^{n} w_i \ln w_i \quad \text{subject to} \quad \alpha = \frac{1}{n-1} \sum_{i=1}^{n} (n - i)\, w_i, \quad \sum_{i=1}^{n} w_i = 1, \quad w_i \in [0, 1].$$

This problem is not computationally easy to solve.

Fuller and Majlender (Fuller and Majlender, 2003)

have obtained the analytical expression of the maxi-

mum entropy weights.

Figure 1 shows the aggregation weights for n = 10 obtained with the above-mentioned method for orness values α ∈ [0.3, 0.7]. In this work, the aggregation weights have been computed by using this method.
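The following sketch computes orness, dispersion and maximum entropy weights numerically. It takes the dispersion in its entropy form, $-\sum_i w_i \ln w_i$, and it does not use the analytical expression of Fuller and Majlender; instead it relies on the standard fact that maximizing entropy under a linear constraint yields weights proportional to an exponential of the constrained quantity, and finds the single remaining parameter by bisection. The function names, the bracket and the tolerance are assumptions.

```python
import math

def orness(w):
    """Equation (3): (1 / (n - 1)) * sum_i (n - i) * w_i, with i starting at 1."""
    n = len(w)
    return sum((n - i) * wi for i, wi in enumerate(w, start=1)) / (n - 1)

def dispersion(w):
    """Entropy of the weights, -sum_i w_i ln w_i (0 * ln 0 is taken as 0)."""
    return -sum(wi * math.log(wi) for wi in w if wi > 0)

def max_entropy_weights(n, alpha, tol=1e-12):
    """Weights of maximal entropy with orness alpha, for 0 < alpha < 1.

    The stationarity conditions give w_i proportional to
    exp(lam * (n - i) / (n - 1)); orness grows with lam, so lam is
    located by bisection on an assumed bracket.
    """
    def weights(lam):
        raw = [math.exp(lam * (n - i) / (n - 1)) for i in range(1, n + 1)]
        total = sum(raw)
        return [r / total for r in raw]

    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if orness(weights(mid)) < alpha:
            lo = mid
        else:
            hi = mid
    return weights((lo + hi) / 2)

w = max_entropy_weights(10, 0.7)
```

For α = 0.5 the procedure returns the uniform weights, whose entropy ln n is the largest possible; moving α towards 1 shifts the mass towards the first weights, that is, towards the max operator.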

4 VISUAL FEATURES

This section deals with the low level features the sys-

tem uses for predicting human judgment of image


is precisely to capture that notion of similarity that

each user has, which can also change between differ-

ent queries. Consequently, the valid criterion of sim-

ilarity appears to be the user’s opinion. This would

have introduced an external variable into the experi-

ment that would have masked the main goal: an ob-

jective evaluation of the system as such. That is why

we have chosen to use an approach in which a given

image has to be found. The search is considered suc-

cessful if the image is ranked within the ﬁrst 16. This

number is arbitrary but we have checked that 16 images shown side by side is a reasonable number for locating a particular one at first sight.

Once the criterion for termination has been

adopted, the experiment will be designed by showing

several images to the user; a choice of 6 images (the

same for all users) was selected from a database of

about 4700. These images are classiﬁed as belonging

to different themes such as ﬂowers, horses, paintings,

skies, textures, ceramic tiles, buildings, clouds, trees,

etc. even though the category is not used at all during

the search. The 6 target images are, in our experience,

representative of different themes and levels of difﬁ-

culty. They are displayed in ﬁgure 2.

Figure 2: Target images used in experiments.

For each target image the search proceeds itera-

tively. In each iteration the user has to select some rel-

evant images (similar to the target according to his/her

judgment) and others signiﬁcantly different from the

target. The number of images of each type is left to

the user, although two conditions must be fulfilled: at least one relevant and one irrelevant image must be selected and the total number of selections has to be greater than 4. The algorithm proceeds as explained

in previous sections and the images are ranked. If the

target appears in the ﬁrst 16, it is considered to have

been found; otherwise the user can move backwards

or forwards to see more images in rank order and a

new iteration of choosing/search/showing begins.
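The two conditions on the user's selections and the top-16 success criterion described above can be written down as a short sketch. The function names and the toy scores are assumptions; the scores stand in for the aggregated relevance probabilities.

```python
def valid_feedback(relevant, irrelevant):
    """At least one relevant, one irrelevant, and more than 4 selections in total."""
    return (len(relevant) >= 1 and len(irrelevant) >= 1
            and len(relevant) + len(irrelevant) > 4)

def target_found(scores, target_id, window=16):
    """True if the target is among the `window` best-ranked images."""
    # Rank image identifiers by decreasing relevance probability.
    ranking = sorted(scores, key=scores.get, reverse=True)
    return target_id in ranking[:window]

# Hypothetical relevance probabilities for 100 images: img0 ranks first, img99 last.
scores = {f"img{i}": 1.0 / (i + 1) for i in range(100)}
```

With these toy scores, `target_found(scores, "img15")` holds because that image is ranked 16th, while `img16` falls just outside the window and would trigger another feedback iteration.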

To ensure that the experiments are not biased, the query tasks were performed by a group of 40 users who had not been involved in the design and development of the system and had no knowledge of the content of the database or of the retrieval features and methods used (untrained users).

Table 2: Average (standard deviation), maximum and minimum number of iterations needed to find a target image.

Image        It. Av. (s.d.)    max    min
Car          5.17 (2.95)       12     1
Flower       4.17 (3.20)       17     1
Butterfly    4.71 (3.70)       19     1
Firework     2.14 (1.81)       9      1
Miro         3.67 (1.55)       8      2
Glass        3.42 (1.52)       6      1
All          3.88 (1.07)       19     1

Table 2 shows the average and standard deviation of the number of iterations needed to find images by these untrained users. The last row shows the average for all images and users. The experiments exhibit good performance in finding a target image (3.88 iterations on average) in the database used.
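As a quick arithmetic check, the figures in the last row of Table 2 are reproduced by the mean and the sample standard deviation of the six per-image averages. This is an observation about the reported numbers, not a statement of how the authors computed them; the variable names are illustrative.

```python
from statistics import mean, stdev

# Per-image average number of iterations, copied from Table 2.
averages = {"Car": 5.17, "Flower": 4.17, "Butterfly": 4.71,
            "Firework": 2.14, "Miro": 3.67, "Glass": 3.42}

overall_mean = round(mean(averages.values()), 2)  # mean of the six averages
overall_sd = round(stdev(averages.values()), 2)   # sample standard deviation (n - 1)
```

The per-image standard deviations (between 1.52 and 3.70) are larger than the one in the last row, which is consistent with that row describing the spread between images rather than between individual searches.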

7 CONCLUSION

This paper addresses the problem of image retrieval

by means of an algorithm based on logistic regression.

The main advantage of the method is the ease of incorporating the feedback of the user. Its main drawback is the lack of sufficient information (too small

sample) to ﬁt the model, since the number of inputs

(image features) is usually high. This has been ad-

dressed by means of partial models that get the output

from each subset of the inputs. The problem of com-

bining the information of the different models, which

is a data fusion problem, is solved by using an ordered

weighted averaging (OWA) operator.

Concerning the experimental results, the average number of iterations shown in Table 2 indicates good performance of the procedure. Further experimentation and analysis of results are currently being carried out by our research group, in which users are grouped and classified with regard to their interaction with the iterative process of image selection.

REFERENCES

de Ves, E., Domingo, J., Ayala, G., and Zuccarello, P. (2006). A novel Bayesian framework for relevance feedback in image content-based retrieval systems. Pattern Recognition, 39:1622–1632.

Fuller, R. and Majlender, P. (2003). On obtaining minimal variability OWA operator weights. Fuzzy Sets and Systems, 136:203–215.

O'Hagan, M. (1988). Aggregating template or rule antecedents in real-time expert systems with fuzzy set logic. In Proc. of 22nd Annu. IEEE Asilomar Conf. on Signals, pages 681–689, Pacific Grove, CA.

Rui, Y., Huang, S., Ortega, M., and Mehrotra, S. (1998). Relevance feedback: a power tool for interactive content-based image retrieval. IEEE Transactions on Circuits and Systems for Video Technology, 8(5).

Smeulders, A., Santini, S., Gupta, A., and Jain, R. (2000). Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12):1349–1379.

Stejic, Z., Takama, Y., and Hirota, K. (2005). Mathematical aggregation operators in image retrieval: effect on retrieval performance and role in relevance feedback. Signal Processing, 85:1297–1324.

Yager, R. (1988). On ordered weighted averaging aggregation operators in multi-criteria decision making. IEEE Trans. Systems Man Cybernet., 18:183–190.

Zhou, X. and Huang, T. (2003). Relevance feedback for image retrieval: a comprehensive review. Multimedia Systems, 8(6):536–544.
