SerfSIN: Search Engines Results’ Refinement using a Sense-driven

Inference Network

Christos Makris

, Yannis Plegas

, Giannis Tzimas

and Emmanouil Viennas

Department of Computer Engineering and Informatics, University of Patras, Patras, Greece

Department of Applied Informatics in Management & Economy, Faculty of Management and Economics, Technological

Educational Institute of Messolonghi, Messolonghi, Greece

Keywords: Search Engines, Semantic-enhanced Web applications, Re-ranking Model, WordNet Senses, Inference

Network.

Abstract: Α novel framework is presented for performing re-ranking in the search results of a Web search engine,

incorporating user judgments as registered in their selection of relevant documents. The proposed scheme

combines smoothly techniques from the area of Inference Networks with text processing techniques

exploiting semantic information, and is instantiated to a fully functional prototype at present leading to a re-

ranking whose quality outperforms significantly the initial ranking. The innovative idea is the use of a

probabilistic network based to the senses of the documents. When the user selects a document, the belief of

the network to the senses of the selected document is raised up and the documents that contain these senses

are ranked higher. Also we present an implemented prototype that supports three different Web search

engines (and it can be extended to support many more), while extensive experiments in the ClueWeb09

dataset using the TREC’s 2009, 2010 and 2011 Web Tracks’ data depict the improvement in search

performance that the proposed approach attains.

1 INTRODUCTION

The global availability of information provided by

the World Wide Web in the past few decades has

made people’s lifes easier in terms of time saving

and accuracy in information seeking. Commercial

search engines have provided the necessary tools to

the average Internet user to search for information

about any topic he/she might be interested in and

their everyday use rises constantly.

However there are still circumstances where one

finds himself/herself wondering around the

information maze posed to him/her by the Web. For

example a user might be interested in “rockets” and

missiles, but get highly ranked results after

performing a query in a search engine about the

famous NBA basketball team, or interested in

animals and specifically in “jaguars”, but get results

about cars. Polysemy can be clarified by the general

context of one’s speaking, but search engines do not

provide the necessary functionality to address this

problem. Moreover, it often happens for the first set

of results returned by a search engine to contain

irrelevant information.

To overcome these drawbacks, in this paper we

propose a new technique aiming to provide the

necessary tools for the refinement of search results,

taking into account feedback provided by the user.

Besides the keywords provided from the initial

query and the set of choices the user makes, we also

utilize the semantic information hidden inside the

returned pages. We feed this enriched combination

of information in a probabilistic model and re-rank

the results, without having to gather or import any

additional data, thus making the whole concept

simple and efficient.

Overall in this work, we describe the SerfSIN

system (Search Engines results ReFinement using a

Sense-driven Inference Network), that uses a re-

ranking model based on inference networks and

enhances it, in order to form an effective system for

efficient reorganization of search results based on

user choices. Moreover, we utilize the WordNet

knowledge base in order to clarify the various senses

that the query terms might carry and thus enrich our

model with semantic information. Our technique is

not restricted to WordNet, but it can also be

extended to support other knowledge bases, such as

222

Makris C., Plegas Y., Tzimas G. and Viennas E..

SerfSIN: Search Engines Results’ Reﬁnement using a Sense-driven Inference Network.

DOI: 10.5220/0004365502220232

In Proceedings of the 9th International Conference on Web Information Systems and Technologies (WEBIST-2013), pages 222-232

ISBN: 978-989-8565-54-9

 2013 SCITEPRESS (Science and Technology Publications, Lda.)

YAGO (Suchanek et al., 2007) and BabelNet

(Navigli and Ponzetto, 2010).

The main idea of the paper is to transfer the

belief of the user to the selected documents through

the constructed network to the other documents that

contain the senses of the selected documents. The re-

ranking of the results is based on a vector that

contains a weight for each document that represents

the probability of the document to be relative for the

user. We construct a probabilistic network from the

terms and the senses of the documents so when the

users select a document, the weights of the

documents that contain these senses are taking

bigger values and so they are ranked higher.

Detailed experiments depict the superiority of the

proposed system in comparison to the initial ranking

and to previously relevant proposed techniques.

The rest of the paper is organized as follows.

Section 2 overviews related literature and section 3

highlights the overall architecture. Next, in section 4

we present the re-ranking process, whereas the

proposed approach is analytically covered in section

5. Section 6 depicts the results of our experiments.

Finally, section 7 concludes the paper and provides

future steps.

2 LITERATURE OVERVIEW

The last decade has been characterized by

tremendous efforts of the research community to

overcome the problem of effective searching in the

vast information dispersed in the World Wide Web

(Baeza-Yates and Ribeiro-Neto, 2011). A standard

approach to Web searching is to model documents

as bags of words, and a handful of theoretical

models such as the well-known vector space model,

have been developed employing this representation.

An interesting alternative is to model documents as

probabilistic networks (graphs), whose vertices

represent terms, documents and user queries and

whose edges represent relations between the

involved entities defined on the basis of any

meaningful statistical or linguistic relationships.

Many works explore the usefulness of such graph-

based text representations for IR like (Blanco and

Lioma, 2012), (Boccaletti et al. 2006) and Bayesian

Networks are prominent in them.

Bayesian Networks (Niedermayer, 2008) are

increasingly being used in a variety of application

areas like searching (Teevan, 2011), (Acid et al.,

2003), (Callan, 2009), Bioinformatics (Ahmed et al.,

2012), and many others. An important subclass of

Bayesian Networks is the Bayesian Inference

Networks (BIN) (Turtle, 1991) that have been

employed in various applications (Teevan, 2011),

(Ma et al., 2006), (Abdo et al., 2011). Moreover,

BINs form a major component in the search engine

Indri’s retrieval model (Metzler et al., 2005).

In this work we introduce a semantically driven

Bayesian Inference Network, incorporating semantic

concepts in order to improve the ranking quality of

search engines’ results. Related approaches were

presented in (Lee et al., 2011), (Abdo et al., 2011).

The authors in (Lee et al., 2011) enrich the

semantics of user-specific information and

documents targeting at efficient implementation of

personalized searching strategies. They adopt a

Bayesian Belief Network (BBN) as a strategy for

personalized search since they provide a clear

formalism for embedding semantic concepts. Their

approach is different to ours, since they use belief

instead of inference networks and they employ the

Open Directory Project Web directory instead of

WordNet. In (Abdo et al., 2011) the authors enhance

the BINs using relevance feedback information and

multiple reference structures and they apply their

technique to similarity-based virtual screening,

employing two distinct methods for carrying out

BIN searching: reweighting the fragments in the

reference structures and a group fusion algorithm.

Our approach aims at a different application and

employs semantic information, as a distinct layer in

the applied inference network.

On the other hand and concerning search result’s

re-ranking, an interesting approach is to exploit

information from past user queries and preferences.

The relevant techniques range from simple systems

implementing strategies that match users’ queries to

collections results (Meng et al., 2002), (Howe and

Dreilinger, 1997) to the employment of the machine

learning machinery exploiting the outcomes of

stored queries, in order to permit more accurate

rankings, the so called “learning to rank” techniques

(Liu, 2011). There is also related but different to our

focus work (Brandt et al., 2011) combining

diversified and interactive retrieval under the

umbrella of dynamic ranked retrieval.

The main novelty of our work centers in the

transparent embedding of semantic knowledge bases

to improve search engine results re-ranking. Also in

order to achieve our purpose we create a new

probabilistic model which takes as input different

semantic knowledge bases. The most relevant to our

work is the system presented in (Antoniou et al.,

2012) embedding instead of Bayesian inference

network, techniques based on the exploitation of

semantic relations and text coverage between results.

SerfSIN:SearchEnginesResults'RefinementusingaSense-drivenInferenceNetwork

223

Our work outperforms (Antoniou et al., 2012)

improving its search performance, while it is based

on a solid theoretical framework.

3 THE PROPOSED SYSTEM

In Figure 1 a high level overview of the overall

system architecture is depicted. As shown the

system is decomposed into the following core

subsystems:

1. The SerfSIN Web Interface that interacts with

the end users in order to provide searching and

search refinement services.

2. The Search API Modules which are responsible

for the communication with various search engines’

APIs, in order to retrieve the relevant results. At

present three search engines are supported, namely:

 Google Search API, with two options provided to

the user for the first set of retrieved results; the use

of the Google deprecated API and the option of

parsing the pages of the Google results.

 Bing Search API.

 Indri Search Machine (Strohman et al., 2005) over

the ClueWeb09 Dataset (Callan, 2009).

3. The Page Crawler that fetches the content of the

search engine results after the end user poses his first

query.

4. The HTML Decomposer that parses the HTML

code of a page and exports useful data, such as the

title, keywords, metatags, highlighted text etc.

5. The Sense Interference Network (SIN)

Constructor that creates the Network from the

previous steps and reorganizes the search engines

results.

Figure 1: SerfSIN High Level System Architecture.

The subsystems work in a sequential manner and the

first three interact with the users and the various

Web sources.

In order for a user to utilize the services provided

by SerfSIN, he/she simply makes a query after

setting the basic search parameters (explained in

section 6) in the SerfSIN Web Interface. Every time

the user clicks on a result, the core module is utilized

producing the new improved ranking. In particular,

the core subsystems work real-time and in parallel,

rearranging the initial order and the algorithms

utilized are presented in detail in the following

sections.

4 THE RE-RANKING PROCESS

The whole process is initialized by the initial query

performed by the user through the SerfSIN Web

Interface. The query keywords are imported to the

search engine selected (Google, Bing or Indri)

through the relevant API and the results are

collected by the Page Crawler. After this step, the

initial results’ ranking is the same as the one given

by the search engine selected.

The next step improves the initial ranking based

on the user’s selections; the SIN Constructor is

utilized accompanied by the input produced by the

HTML Decomposer. This step runs iteratively every

time the user makes a selection. In particular the

proposed network is utilized, either as a standalone

re-ranking algorithm, or in combination with the

initial ranking returned by the search engine, or with

the previous ranking of the results in the re-ranking

process (if we have a series of re-rankings). In all

cases we resolve ties, by following the previous

ranking. The above options are expressed by the

following two equations, for the ranking score of

document d

  







  





(1)





  1



∗1∗



(2)

where R

denotes the re-ranking weight provided by

the network for d

(its computation is described in

the next section), previous_rank(i) stands for the

previous rank position of d

, n is the number of

results retrieved and β is a user defined weight

factor. Intuitively, when the factor β is raised the re-

ranking process results in major rank changes. The

equation



_1



introduces a

factor based on the previous ranking of a result in

the list. Equation (1) is used in the case of the

network use as a standalone re-ranking system,

while equation (2) is used for the composite case

where the new ranking system is composed with the

previous ranking of search results. In the new

HTML Decomposer

SIN Constructor

Page Crawler

SerfSIN Subsystems

SerfSIN Web Interface

Google

Bing

Indri

Search API Modules

GoogleGWeb

WEBIST2013-9thInternationalConferenceonWebInformationSystemsandTechnologies

224

ranking produced, the results are ranked according

to the above calculated score. When the user selects

further results, the same procedure is followed with

the difference that the ranking produced by the

previous phase is used as input for the next

reordering.

Even though we assume that most results selected by

a user are relevant, our scheme incorporates

smoothly the previous ranking, hence it is robust to

user misselections. A misselection of a result leads

to the inclusion of its relevant information to the

ranking process, but still can be made to not affect

significantly the produced ranking.

5 RE-RANKING WEIGHT

CALCULATION

Our extension of inference network, the Sense

Inference Network (SIN), as depicted in

Figure 2,

consists of four component levels: the document

level, the term level, the sense level, and a fourth

level that represents the documents nodes and the

value they take in order to re-rank the results; the

fourth level can be considered to play the role of the

query layer in the traditional inference network

model and its presence signifies that we are not

interested to model specific information needs, but

re-rankings based on users’ reaction. The SIN is

built once for the retrieved documents of the query’s

results and its structure does not change during re-

ranking. The document level contains a node (



’s)

for each document of the query’s results. For each

term of the documents nodes texts, we add terms

nodes (



‘s) to the network and we interconnect the

documents nodes with the terms nodes with arcs.

The terms are induced by the retrieved pages by

applying to them sequentially: (i) HTML stripping,

(ii) removal of the stop words and finally, (iii)

stemming using the Porter stemming algorithm

(Porter, 1980). For each term node we also find its

different senses using WordNet (Howe and

Dreilinger, 1997) and we add them to the network.

WordNet is a lexical database for the English

language. It groups English words into sets of

synonyms called synsets, provides short, general

definitions, and records the various semantic

relations between these synonym sets. The term

nodes are connected through arcs to their different

senses nodes (



’s). Finally, all the senses nodes

which are contained in a document are connected to

the respective document node at the last level. The

formed network is a four level directed graph in

which the information flows from the documents

nodes of the first level to the term nodes and then

through the senses nodes to the documents nodes of

the last level.

The innovative idea in our network is the

existence of the level of senses (concepts) based on

the WordNet knowledge base. The term nodes are

connected to their different senses through directed

arcs. The existence of a directed path between a

document node and a sense node denotes that this

sense is appearing in the respective document and

more formally to the event that a sense has been

observed in the documents collection. Similarly, a

single sense node might be shared by more than one

term. The dependence of a sense node upon the

respective term node is represented directly in the

network through an arc.

The final process for the construction of SIN is

the creation of the arcs from the senses nodes to the

document nodes at the last level. The last level's

nodes are different from the nodes of the first layer;

they represent the same entities (documents) but in

different time instances (we depict this fact by

drawing the arcs from the sense nodes to the nodes

at the last layer as dashed, just to depict the fact that

we are moving at a different time instances; this is a

common practice that breaks cycles in Bayesian

networks).

Figure 2: The Sense Inference Network.

The sense level models the hidden semantics and the

belief of the network that the senses of a document

are the senses that the user looks for. The senses

nodes are connected to every node at the last level

representing a document where this sense appears

(this can be validated if there exists a path from the

document node at the first level to the sense node).

The document nodes at the last level have an

accumulated belief probability that is used for re-

ranking. The value of this belief is estimated based

on the different senses of the document and denotes

SerfSIN:SearchEnginesResults'RefinementusingaSense-drivenInferenceNetwork

225

the conceptual similarity between the document and

the information need of the user.

5.1 Estimation of Probabilities

and Rearrangement of the Results

Our inference network differs from that in (Turtle,

1991) since we employ it only as a weight

propagation mechanism, using the (Turtle, 1991)

machinery for computing the beliefs at the last level

of the network, provided a set of prior probabilities

for the first level; when a user selects a document its

prior probability is raised and we compute the

change at the beliefs at the last level. Our approach

is different in comparison to (Turtle, 1991) and is

adapted to the problem of the successful

reorganization of search results. (Turtle, 1991)

proposed an information retrieval model while our

work proposes a re-ranking model.

In order to estimate the probabilities for the

nodes of the constructed SIN, we first begin at the

root (documents) nodes. Each document node has a

prior probability that denotes the chance of the

selection of that document from the user. For our

collection and for each document 



this prior

probability will generally be set to be:



, i [1 ]

pd n



 

(3)

where n is the number of the query results.

This probability will change into 1, when a result

is selected in order to denote that this document is

relevant from the user’s point of view. This belief is

transferred through the network to the senses nodes

and then to the final layer representing the document

nodes and changes the values of the classification

weights, re-ranking the results.

For all non-root nodes in the SIN we must

estimate the probability that a node takes, given any

set of values for its parent nodes. We begin with the

term nodes. Each term node contains a specification

of the conditional probability associated with the

node, given its set of parent document nodes. If a

term node t has a set of parents pars

= {d

, … , d

we must estimate the probability P(t|d

, … , d

Here we follow the inference network machinery

described in (Turtle, 1991) employing the tf-idf

weights for the term nodes and setting:

  

|0,50,5*, ()

td ntf t d nidf t

(4)

where ntf() and nidf() are the normalized term

frequency and normalized inverse document

frequency components for term t. In particular, if

tf(t,d

) denotes the number of times term t appears in

, f

the number of its occurrences in the collection

and max_tf(d

) the maximum number of a term’s

occurrences in d

then:

log( )

tf (t,d )

( , ) , ( )

max_tf(d ) logn

ntf t d nidf t

(5)

Based on these weights we can collect the belief for

a term node from its father set by employing the

formed weighted sum link matrix (Turtle, 1991).

The next step is the estimation of the

probabilities for the sense nodes. If a sense node S

has a set of parents pars

= {t

, … , t

}, we must

estimate the probability P(S|t

, … , t

). The

probability of a sense node denotes the importance

of the sense for each term. Initially, the different

meanings for each term are calculated using

WordNet. The probability of a sense to be the

unique sense of a term is equally likely among the

different senses and is defined to 1/ (number of

different senses) for each term, hence we set.

(|) ,

pSt



(6)

where S is a sense of 



and 



is the number of the

different senses of the term





. Based on these

weights we collect the belief for the sense node from

its father set, by employing a weighted sum link

matrix.

Finally, we estimate the probability for a

document node at the last level to be relative to the

user’s interests, based on the network’s structure.

The parents of the documents nodes at the last level

of the network are the senses nodes. Therefore, the

selection from the user of a result gives the network

the ability to distinguish the senses, which interest

the user and the document nodes at the last level get

new weights signifying this knowledge. The senses

nodes are affecting the document nodes according to

their semantics as these are represented through arcs.

We do not give weights but instead use a simple sum

link matrix and the probability/belief for a document

d at the last layer with q father senses is simply the

sum of the beliefs of its father senses divided by q.

The Senses nodes are connected to the

documents nodes for which there exist paths

between the documents and the senses (through the

term nodes). Final step is the computation of the

weights at last level that entail the belief of the

WEBIST2013-9thInternationalConferenceonWebInformationSystemsandTechnologies

226

relatedness of the documents to the selected

document.

The values of the beliefs at the last level of

document nodes















compose a

vector



,





,…,



. Let





,





,…,





, be

the vector of beliefs at the last level before the

selection of any results from the user (initial

probabilities at the first level are all equal). For each

result that the user selects, a new vector is estimated.

In particular when the user selects a single result d,

its probability at the first level is raised to 1, and a

new belief for every document in the last level is

computed. After this weight propagation, the re-

ranking weight R

for document d

is defined as















. Similarly as the user selects further

results, the probabilities of the respective nodes at

the first level are raised to 1 and the beliefs are

recalculated. By repeating this process and

computing the re-ranking weights using formulae (1)

and (2) (note that we always subtract from the initial

weights) we reorganize the results accordingly.

The innovative idea is the use of the senses (nor

the terms) in order re-rank the results. The

importance of the terms modulates the importance of

the senses. When the user selects a result, the terms

of this document get higher values, thus the

respective senses get higher values. Consequently,

all the documents that have these senses get higher

values in the final ranking process. The more senses

(from the selected document) a document has, the

higher ranking value it takes. Thus, the documents

which contain the more senses (and the more times)

from the selected document, are ranked higher.

6 EVALUATION OF SERFSIN’S

PERFORMANCE

To carry out our evaluation, we explored 150 web

queries; 50 queries from the TREC WebTrack 2009

(Clarke et al., 2009), 50 queries from the TREC

WebTrack 2010 (Clarke et al., 2010) and 50 queries

from the TREC WebTrack 2011 (Clarke et al., 2011)

datasets respectively. All tracks employ the 1 billion

page ClueWeb09 collection (http://lemurproject.org/

clueweb09/). These ranked lists of results contain,

for every page, relevance judgments made by human

assessors.

We assessed our network performance by

comparing the rankings it delivered to the rankings

search machines returned for the same set of queries

and results. For our comparisons, we relied on: (i)

the available relevance judgments and (ii) the

Normalized Discounted Cumulative Gain (nDCG)

measure (Jarvelin and Kekalainen, 2000), which

quantifies the usefulness, or gain, of a document

based on its position in the result list. The

assumption in nDCG is that the lower the ranked

position of a relevant result, the less valuable it is for

the user; because it is not likely that it will be

examined by the user. Formally, the nDCG

accumulated at a particular ranking position p is

given by:

(2)

rel

DCG rel nDCG

log i IDCG



 



(7)

where rel

are the document relevance scores and

IDCG

is the Ideal DCG, i.e. the DCG values when

sorting the documents by their relevance.

We set the re-ranking experiment by selecting

randomly a relevant document, after posing each

query and then performing re-ranking; we used

nDCG before and after re-ranking to estimate the

ranking performance.

The experiments carried out can be distinguished

according to three basic parameters (all of the

parameters can be accessed through the SerfSIN

Web Interface at http://150.140.142.5/research/

SerfSIN/):



The search engine selected to perform the initial

query.



The percentage of the resulting pages text used to

identify the terms that feed the second level of the

constructed network (SIN).

 The participation of the initial ranking returned by

the search engine, or the previous ranking of the

results after a user’s selection, in the re-ranking

process, that is the significance of the proposed

network compared to the initial ranking.

Our experiments were carried out using the Indri

search engine over the ClueWeb09 Category B

Dataset. We also utilized two different general

purpose search engines, namely Google and Bing.

We used 50 results in the case of Indri and Bing,

while in the case of Google we used 20 in order to

test if the chosen result set size has any affect on the

attained performance. The queries used, as well as

the relevant judgments rely on the ClueWeb09

Dataset and this is the reason why we selected Indri

as our main experimental search engine. In the case

of the general purpose search engines, there are no

relevant judgments for all the returned pages; hence

we ignored these pages in our measurements.

In relation to the second parameter, that is the

text percentage of the resulting pages used to extract

the network terms, we employ two distinct

SerfSIN:SearchEnginesResults'RefinementusingaSense-drivenInferenceNetwork

227

approaches: we either use the whole text stemming

from the HTML pages, or we use the most important

text contained in the pages (e.g. keywords, meta

description, Google description, h1, h2 tags, strong

words etc.).

Table 1: Parameters' configuration for the twelve

experiments (Most Important Parts: MIP, Full Text: FT).

Exp.

Engine

Text

Coverage

Use of

Initial

Ranking

Web

Track

Queries

1 Indri MIP Yes (β=1) 2009

2 Indri MIP Yes (β=1) 2010

3 Indri MIP Yes (β=1) 2011

4 Indri MIP Yes (β=2) 2011

5 Indri FT Yes (β=1) 2010

6 Indri FT Yes (β=1) 2011

7 Google FT No 2010

8 Google MIP No 2009

9 Google MIP Yes (β=1) 2010

10 Bing MIP No 2011

11 Bing MIP Yes (β=1) 2011

12 Bing FT No 2011

Table 2: nDCG Averages for every experiment.

Exp.

nDCG Aver.

(Before)

nDCG Aver.

(After)

Difference

1 0.259444438 0.309425945

0.04998150

2 0.236960647 0.285369305

0.04840865

3 -0.09917842 0.055169266

0.15434768

0.100722072

0.011697012

0.11241908

5 0.062907506 0.120053429

0.05714592

0.103749409

-0.03013304

0.07361636

7 0.538630812 0.591288965

0.05265815

8 0.525453777 0.593815908

0.06836213

9 0.549054511 0.575225518

0.02617100

10 0.43777936 0.486484735

0.04870537

11 0.434751776 0.468521246

0.03376946

12 0.43111441 0.449078004

0.01796359

Avg: 0.264370611 0.326333024

0.06196241

In the following, we depict the most representative

experimental results from the vast material that we

collected. We summarize all the depicted

configurations (Table 1) and then we particularize

presenting the average nDCG values for every case

(Table 2). We note that the proposed network has

significantly better overall performance by

0.061962413, since the average nDCG values of the

search engines and the proposed network are

0.264370611 and 0.326333024 respectively.

We present twelve experiments with alternative

configurations based on the parameters analyzed in

the previous paragraphs (Table 1). For every

experiment we calculated the total average of nDCG

values, for every rank position of the results. We

also calculated their difference, in order to depict the

corresponding improvement that is evident in all

cases.

In the next figures the experiment’s graphs are

depicted, where for every rank position of the

results, the average nDCG values of the search

engine are compared to the average nDCG values of

the proposed network. We started our analysis with

the Indri search engine, where the experiments took

place in a “controlled” environment employing the

ClueWeb09 Dataset and its relevant judgments and

we continued by studying the proposed network’s

performance for the cases of general purpose search

engines.

Figure 3: Experiment 1.

Figure 4: Experiment 2.

WEBIST2013-9thInternationalConferenceonWebInformationSystemsandTechnologies

228

Figure 5: Experiment 3.

Figure 6: Experiment 4.

Figure 7: Experiment 5.

Figure 8: Experiment 6.

Figure 9: Experiment 7.

Figure 10: Experiment 8.

Figure 11: Experiment 9.

Figure 12: Experiment 10.

Figure 13: Experiment 11.

Figure 14: Experiment 12.

SerfSIN:SearchEnginesResults'RefinementusingaSense-drivenInferenceNetwork

229

In the performed experiments we measured the

proposed network’s behaviour, as a standalone re-

ranking system as well as for the composite case,

where the previous ranking of the results is counted

in the re-ranking process.

As the figures depict, the proposed network

performs well in both cases: when used as a

standalone re-ranking system (experiments 7, 8, 10

and 12), or in the case where the previous ranking is

taken into account. Especially in the experiments 8,

10 where the network was used as the primary re-

ranking engine in contrast to the general purpose

search engines Google and Bing, the performance

was excellent.

It is worth noticing that when the network takes

into account the previous ranking (experiments 1-6,

9 and 11), it behaves better than the search engines

and generally smoother (less rank changes when

reordering) than the case where it acts as a

standalone re-ranking system.

Another interesting observation is that the

overall network performance did not deescalate

when the most important text parts were used for the

terms extraction process, instead of utilizing the full

text. Moreover, as shown in experiment couples (1,

2), (2, 3) and (7, 8) the query sets employed did not

play an important role in the quality of the results, as

the network behaves equivalently in all

configurations.

Finally, in our experiments (we indicatively

depict experiments 3 and 4) we notice that when the

β factor, increases from 1 to 2 the results remain the

same, so in most of the cases the experiments were

performed with β equal to 1; moreover our technique

leads to better performance irrespectively of the

result set size.

Figure 15: A comparison between the two approaches.

We have employed in our experiments the system

described in (Antoniou et al., 2012) and in

Figure 15

a diagram comparing their technique with ours is

depicted, assessing the total average nDCG values of

all the above experiments for the first 20 ranking

positions. It is clear that the new approach performs

better in all positions by 0.033364065, since the

average nDCG values of the previous approach and

the proposed network are 0.284028022 and

0.326333024 respectively.

Moreover, the approach presented in this paper

differs in the way the relevant information is

retrieved, as well as the re-ranking process. The

techniques employed in (Antoniou et al., 2012) are

based on the exploitation of semantic relations and

text coverage between results, while we employ a

variant of a Bayesian inference network. The main

advantages/features of our technique can be

summarized as follows:



Employment of a solid theoretical framework that

can be tuned and be enhanced and not just

heuristic use of terms and senses and their overlap.



Ability to transparently enhance the inference

network with other ontologies.



Due to the inference network employed and our

weight parameter, our technique is more robust to

user errors, and a misselection of a result cannot

affect significantly the final ranking.

It should be finally noted that we employed the

paired t test, and the computed one tailed p-values

were found to be very small (in all cases less than

0.01), proving that the results have significant

differences and thus signifying the superiority of our

technique in a strict statistical sense, and for a

significance level of a=0.05.

7 CONCLUSIONS AND FUTURE

WORK

In this work we have presented a novel framework

for re-ranking the search results of user queries

employing a sense inference network in combination

with semantic based techniques exploiting WordNet.

The framework was applied in three search engines,

and the attained results are quite encouraging

depicting the ability of the proposed technique to

capture the user preferences and produce a

preferable ranking. A minor disadvantage seems to

be the re-ranking process execution time; we plan to

reduce the execution time by carefully tuning our

code, without affecting the search quality. Moreover

we aim to incorporate in our technique other

knowledge bases besides WordNet, such as YAGO

(Suchanek et al., 2007), and BabelNet (Navigli and

Ponzetto, 2010) and to further enhance our inference

network by embedding in it relationships between

synsets present in WordNet.

Our approach comes as an improvement to the

technique that was proposed in (Antoniou et al.,

WEBIST2013-9thInternationalConferenceonWebInformationSystemsandTechnologies

230

2012), and embeds semantic information, but

without exploiting the machinery of the inference

networks. It could be motivating to further explore

the connection between the two approaches, and

propose a unified scheme incorporating both of

them.

ACKNOWLEDGEMENTS

This research has been co-financed by the European

Union (European Social Fund-ESF) and Greek

national funds through the Operational Program

“Education and Lifelong Learning” of the National

Strategic Reference Framework (NSRF)-Research

Funding Program: Heracleitus II. Investing in

knowledge society through the European Social

Fund.

This research has been co-financed by the

European Union (European Social Fund-ESF) and

Greek national funds through the Operational

Program “Education and Lifelong Learning” of the

National Strategic Reference Framework (NSRF)-

Research Funding Program: Thales. Investing in

knowledge society through the European Social

Fund.

Finally, authors also would like to thank the

reviewers for their valuable comments and

suggestions.

REFERENCES

Abdo, A., Salim, N., and Ahmed, A., 2011. Implementing

relevance feedback in ligand-based virtual screening

using Bayesian inference network. Journal of

Biomolecular Screening, 16, 1081–1088.

Acid, S., de Campos, L. M., Fernandez, J. M., and Huete,

J. F., 2003. An information retrieval model based on

simple Bayesian networks. International Journal of

Intelligent Systems, 18, 251–265.

Ahmed, A., Abdo, A., and Salim, N., 2012. Ligand-Based

Virtual Screening Using Bayesian Inference Network

and Reweighted Fragments. The Scientific World

Journal.

Antoniou, D., Plegas, Y., Tsakalidis, A., Tzimas, G., and

Viennas, E., 2012. Dynamic Refinement of Search

Engines Results Utilizing the User Intervention.

Journal of Systems and Software, 85, 7, 1577–1587.

Baeza-Yates, R. and Ribeiro-Neto, B., 2011. Modern

Information Retrieval: the concepts and technology

behind search. Addison Wesley, Essex.

Blanco, R., and Lioma, C., 2012. Graph-based term

weighting for information retrieval. Information

Retrieval, 15, 1 (2012), 54-92.

Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., and

Hwang, D. U., 2006. Complex networks: structure and

dynamics. Physics Reports, 424, 175–308.

Brandt, C., Joachims, T., Yue, Y., and Bank, J., 2011.

Dynamic Ranked Retrieval, In WSDM ‘11.

Callan, J., 2009. The ClueWeb09 Dataset. available at

http://boston.lti.cs.cmu.edu/clueweb09 (accessed 1st

August 2012).

Chapelle, O., and Zhang, Y., 2009. A dynamic Bayesian

network click model for web search ranking. In

Proceedings of the 18th International Conference on

WWW, ACM, New York, USA, pp. 1–10.

Clarke, C. L. A., Craswell, N., and Soboroff, I., 2009.

Overview of the TREC 2009 Web Track. In

Proceedings of the 18th TREC Conference.

Clarke, C.L.A., Craswell, N., Soboroff, I., and Cormack,

G., 2010. Overview of the TREC 2010 Web Track. In

Proceedings of the 19th TREC Conference.

Clarke, C.L.A., Craswell, N., Soboroff, I., and Voorhees,

E. M., 2011. Overview of the TREC 2011 Web Track.

In Proceedings of the 20th TREC Conference.

Fellbaum, C., 1998 WordNet, an electronic lexical

database. The MIT Press.

Howe, A. E., and Dreilinger, D., 1997. SavvySearch: a

meta-search engine that learns which search engines to

query. AI Magazine, 18, 2, 19–25.

Jarvelin, K., and Kekalainen, J., 2000. IR Evaluation

Methods for Retrieving Highly Relevant Documents.

In Proceedings of the 23rd International ACM SIGIR

Conference, 41-48.

Lee, J., Kim, H., and Lee, S., 2011. Exploiting Taxonomic

Knowledge for Personalized Search: A Bayesian

Belief Network-based Approach. Journal of

Information Science and Engineering, 27, 1413-1433.

Liu, T-Y., 2011. Learning to rank for Information

Retrieval. Springer.

Ma, W. J., Beck, J. M., Latham, P. E., and Pouget, A.,

2006. Bayesian inference with probabilistic

population codes. Nat. Neurosci., 9, 1432–1438.

Meng, W., Yu, C., and Liu, K., 2002. Building efficient

and effective metasearch engines. ACM Computing

Surveys, 34, 1, 48–89.

Metzler, D., Turtle, H., and Croft, W. B., 2005. Indri: A

language-model based search engine for complex

queries (extended version). IR 407, University of

Massachusetts.

Navigli, R., and Ponzetto, S. P., 2010. BabelNet: building

a very large multilingual semantic network. In

Proceedings of the 48th Annual Meeting of the

Association for Computational Linguistics (ACL

2010), 216-225.

Niedermayer, D., 2008. Innovations in Bayesian

Networks. Springer.

Porter, M.F., 1980. An algorithm for suffix stripping.

Program, 14, 3, 130−137.

Strohman, T., Metzler, D., Turtle, H., and Croft, B., 2005.

Indri: A Language Model-based Search Engine for

Complex Queries. In Proceedings of the International

Conference on Intelligence Analysis (May 2-6, 2005),

McLean, VA.

SerfSIN:SearchEnginesResults'RefinementusingaSense-drivenInferenceNetwork

231

Suchanek, F. M., Kasneci, G., and Weikum, G., 2007.

YAGO: A core of semantic knowledge. In

Proceedings of the 16th International Conference on

WWW, 697-706.

Tebaldi, C., and West, M., 1998. Bayesian Inference of

Network Traffic Using Link Data. In the Journal of

the American Statistical Association, 557–573.

Teevan, J. B., 2011. Improving Information Retrieval with

Textual Analysis: Bayesian Models and Beyond.

Master’s Thesis, Department of Electrical

Engineering, MIT Press.

Turtle, H. R., 1991. Inference Networks for Document

Retrieval. Ph.D. Thesis.

WEBIST2013-9thInternationalConferenceonWebInformationSystemsandTechnologies

232