OPTIMAL INFORMATION GATHERING SCHEME OVER A SCALABLE GRID INFORMATION SERVICES ARCHITECTURE

Nianjun Zhou, Dikran Meliksetian, Jean-Pierre Prost, Irwin Boutboul

150 Kettletown Road, Southbury, CT 06488, USA

Keywords: Information gathering, cost optimization, Markov Chain, information reduction, grid computing, dynamic

programming.

Abstract: A proactive model for gathering resource attribute values in large scale distributed systems is proposed and

analyzed within the context of resource discovery. This model, based on a tree topology, relies on

information reduction to limit the amount of information collected at each node of the tree structure and to

minimize information update and query cost. A Markov chain is used to model resource attribute value

changes. This model is solved using dynamic programming to determine the optimal reduction scheme that

minimizes the overall cost of updating and querying resource attributes.

1 INTRODUCTION

Grid computing (Kaufmann, 2004; Foster et al.,

2001; Foster et al., 2002) enables the virtualization

of distributed computing and data resources such as

CPU, network bandwidth and storage including

memory to create a single system image, granting

users and applications seamless access to vast IT

capabilities. One of the key issues is to maintain

accurate information about the entities constituting

the system. Entity state is typically represented by

attributes, exhibiting values, which change over

time. As the system size (i.e. the number of

resources in the system) grows, maintaining an

accurate representation of its state becomes a real

challenge, since the number of messages required to

maintain this state scales with the number of

resources. Typical models used for information

gathering, such as the Globus Toolkit MDS

(Czajkowski, et al., 2001; Zhang, Schopf, 2004),

take a reactive approach. In such models, it is only

once a query (explicit or implicit) for a resource

attribute value is submitted that the relevant

information is fetched. In order to improve the

latency incurred in answering queries, caching

techniques are used to store the information in

aggregator nodes until a pre-defined time-to-live

period expires. A tree topology is used, where the

leaves represent the system resources and upper

level nodes are aggregator nodes. In these models

however, the amount of information stored at each

aggregator node grows linearly with the number of

resources registered to report their attribute values to

them, and once cached information becomes stale, querying it may incur significant latency. This reactive approach is efficient when queries to the system are less frequent than changes of the attribute value.

In other grid information systems, peer to peer

interactions are used among aggregator nodes to

improve scalability (Mastroianni et al., 2005), or

filtering techniques (Balaton et al, 2002) and

publish-subscribe mechanisms (Jie, 2004; Cooke et

al., 2004) allow for the selection of the information

to be gathered and thus limit the amount of

information aggregated. However, in these systems,

depending upon which aggregator is queried, the

resulting information can vary greatly.

Ganglia (Massie et al., 2004) is a scalable distributed monitoring system for high performance computing environments. It is based on a hierarchical design

targeted at federations of clusters. It relies on a

multicast-based listen/announce protocol to monitor

state within clusters and uses a tree of point-to-point

connections amongst representative cluster nodes to

federate clusters and aggregate their state. Although Ganglia is a monitoring system, whereas the objective of the scheme proposed in this paper is data acquisition and resource discovery, the two share a common proactive approach for data

acquisition, and both are concerned with minimizing

the network load required for this operation. Since

the objectives of the two systems are different, the

approaches for load minimization are different.

While Ganglia uses standard data compression

techniques, we use an information reduction scheme

as described in the subsequent sections.


Zhou N., Meliksetian D., Prost J. and Boutboul I. (2006).

OPTIMAL INFORMATION GATHERING SCHEME OVER A SCALABLE GRID INFORMATION SERVICES ARCHITECTURE.

In Proceedings of WEBIST 2006 - Second International Conference on Web Information Systems and Technologies - Internet Technology / Web

Interface and Applications, pages 184-189

DOI: 10.5220/0001256801840189

Copyright © SciTePress

In this paper, we propose a proactive model, which

triggers information updates up the tree structure

each time the attribute value of a resource changes.

In addition, we reduce information going up the tree

by consolidating at each node the values of a given

attribute from all the nodes reporting to it into a set

of attribute value intervals. Upon an explicit or

implicit user query, the reduction nodes are

consulted in a hierarchical fashion until the response

can be formulated. Information reduction provides a mechanism that avoids propagating information not needed by queries to the higher levels of the topology. The key

issue becomes the determination of the optimal

number and ranges of the intervals at each tree level

to minimize the cost for updating and querying a

resource attribute.

This paper is organized as follows. In Section 2, we

formulate our optimization problem in the case of a

single attribute. In Section 3, we describe a

stochastic model where the process of the attribute

value change is represented as a Markov chain. In

Section 4, we solve the model to determine the

update and the query costs associated with a single

attribute. In Section 5, we demonstrate that the

determination of the optimal set of attribute value

intervals at each tree level to achieve least cost is a

dynamic programming problem, and we propose an

algorithm to determine the optimal solution. In

Section 6, we conclude our paper with future plans,

including the extension of this research into multiple

attribute scenarios and a comparison with the

reactive model.

2 PROBLEM FORMULATION

As described above, we have a multi-level tree

topology, in which leaves are system resources and

upper level nodes are reduction nodes. We assume a

fully balanced tree, in which each node has a limited

fixed number of children. The leaf nodes of our tree

structure are the system resources and are monitored

regularly for attribute value changes. Each time an

attribute value changes, some of the reduced

information maintained for this attribute by the

reduction nodes it reports to (directly or indirectly –

we will refer to these nodes as its reporting branch),

may have to be updated. In other words, some of the

numbers of resources in each attribute value interval

of the reporting nodes may have to be incremented

or decremented. We want to minimize the number of

updates up the reporting branch. Therefore, if we

were to only look at the update cost, the best

solution is to use coarse attribute value intervals.

We must also take into consideration the query cost.

Unlike updates, which are triggered from leaf nodes

to the root, queries start at the root. If we only use

coarse intervals, depending on the attribute value

query (assumed to be of the form “find all the resources for which attr ≤ value”), we may end up

traversing most of the tree, since most top level

reduction nodes will have resources reported in each

attribute value interval. Therefore, the query cost

may be expensive.

The whole problem is to find the optimal set of

attribute value intervals that achieves the best trade-

off between the update cost and the query cost. We

define the cost as the number of messages

exchanged between nodes (either between a resource

and a reduction node, or between two reduction

nodes).

3 OUR MODEL

Each node has a fixed number of nodes reporting directly to it, denoted $n$. $K$ represents the number of levels in the tree and $N$ the number of leaf nodes (i.e. system resources). Level 0 always represents the leaf nodes and level $K$ the root. Finally,

$$K = \log_n N \tag{1}$$

3.1 Attribute Change Model

We assume that the attribute can only have discrete numerical values. We let $M$ represent the number of attribute values and the set $\{a_1, a_2, \ldots, a_M\}$ represent the possible values of the attribute. We model the attribute value change process as a Markov chain, and the changes of the attribute at different leaf nodes are independent and identically distributed. We further assume that the attribute value can either stay the same, or increase or decrease by one unit within a time interval, as in a birth-and-death process (Leon-Garcia, 1994). This assumption is valid if we limit the time interval to an arbitrarily small value. We denote by $p_0(i)$ the equilibrium probability that the value of the attribute is $a_i$, and by $f_0(i)$ and $r_0(i)$ the forward and backward transition probabilities from $a_i$ to $a_{i+1}$ and $a_{i-1}$ respectively. The equilibrium probability $p_0(i)$ can be deduced from the transition probabilities. For our analysis, we assume that these transition probabilities are known a priori.
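As an illustration of how $p_0(i)$ can be deduced from the transition probabilities, the following minimal sketch uses the detailed-balance relation of a birth-and-death chain, $p_0(i)\,f_0(i) = p_0(i+1)\,r_0(i+1)$. The function name, the 0-based indexing and the numerical values are assumptions made for this example only; they do not come from the paper.

```python
def stationary_distribution(f, r):
    """Equilibrium probabilities of a birth-and-death chain.

    f[i] is the probability of moving from value i to i+1 and r[i] the
    probability of moving from value i to i-1 (0-based indices).
    Detailed balance gives p[i] * f[i] == p[i+1] * r[i+1].
    """
    weights = [1.0]
    for i in range(len(f) - 1):
        weights.append(weights[-1] * f[i] / r[i + 1])
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 4-value attribute (numbers chosen only for illustration).
f0 = [0.2, 0.2, 0.2, 0.0]   # no forward move from the largest value
r0 = [0.0, 0.1, 0.1, 0.1]   # no backward move from the smallest value
p0 = stationary_distribution(f0, r0)
```

With these inputs each value is twice as likely as the one below it, because every forward probability is twice the matching backward probability.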

3.2 Update, Query and Cost Models

Updates occur at the same regular time intervals as

attribute changes. Updates are triggered up the tree,

starting from the system resource whose attribute

value either increased or decreased, along its

reporting branch. The number and identities of

resources falling into each attribute value interval of

each reduction node are adjusted when appropriate.

An update is generated by a leaf node only when the

change of the attribute would cause a change in the

attribute value interval where the value falls at the

parent node. Similarly, an update is generated by a

reduction node only if the change will cause a

change of the attribute value interval of its parent

node.

The number of attribute value intervals at a given level $k$ is denoted by $l_k$, and the intervals are indexed with $i \in \{1, \ldots, l_k\}$. The attribute value of a given leaf node will belong to one and only one attribute value interval at level $k$. We use $I_k$ to represent the random variable corresponding to the index of the attribute value interval to which the attribute value of a given resource belongs. It is obvious that the changes of $I_k$ are Markov birth-and-death processes. We use $p_k(i)$ to denote the equilibrium probability that $I_k = i$. The transition probabilities of $I_k$ from interval $i$ to interval $(i+1)$ and $(i-1)$ during a time interval are denoted by $f_k(i)$ and $r_k(i)$ respectively. Based on our earlier assumption, all other transition probabilities are null.

We consider only simple queries of the form “find all the system resources for which the value of attribute $x$ is less than or equal to $a_j$”, where $a_j$ represents the $j$th discrete numerical value that $x$ can have.

In this study, we only consider the cost related to

network traffic. We consider the message payload

cost to be negligible compared to the message

assembly, transfer, and de-assembly, hence the cost

will be expressed as the total number of messages

exchanged between nodes, while accounting for all

potential attribute value updates and queries.

4 UPDATE AND QUERY COSTS

The total number of leaf nodes that are in the subtree rooted by a node at level $k$ is $n^k$. The number of nodes at level $k$ is equal to $N / n^k$. The total number of nodes at level $k$ and above is denoted by $m_k$ and is equal to:

$$m_k = \sum_{j=k}^{K} \frac{N}{n^j} = \sum_{j=k}^{K} n^{K-j} = \sum_{j=0}^{K-k} n^j = \frac{n^{K-k+1} - 1}{n - 1} \tag{2}$$
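The closed form in equation (2) can be checked numerically against the direct level-by-level sum. The sketch below is only a sanity check; the fan-out, the height and the function name are hypothetical choices for the example.

```python
def nodes_at_or_above(n, K, k):
    """Closed form of equation (2): number of nodes at level k and above."""
    return (n ** (K - k + 1) - 1) // (n - 1)


n, K = 3, 4          # hypothetical fan-out and tree height
N = n ** K           # number of leaves, so that K = log_n N
for k in range(K + 1):
    # Direct count: N / n^j nodes at each level j from k up to the root.
    direct = sum(N // n ** j for j in range(k, K + 1))
    assert direct == nodes_at_or_above(n, K, k)
```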

Given the attribute value intervals at all levels and the Markov chain model for the attribute values specified by $p_0(i)$, $f_0(i)$ and $r_0(i)$, the equilibrium probability distribution and the transition probabilities at all levels can be computed as follows. Consider the $i$th attribute value interval $[a_{x_i}, a_{y_i}]$ at level $k$. The probability of the attribute value belonging to the $i$th interval is equal to the sum of the probabilities that the attribute has values $a_{x_i}$ through $a_{y_i}$:

$$p_k(i) = \sum_{j=x_i}^{y_i} p_0(j) \tag{3}$$

The transition probabilities are determined similarly:

$$f_k(i) = \frac{p_0(y_i)\, f_0(y_i)}{p_k(i)} \tag{4}$$

and

$$r_k(i) = \frac{p_0(x_i)\, r_0(x_i)}{p_k(i)} \tag{5}$$
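Equations (3) to (5) translate directly into a small routine that reduces the leaf-level model to the interval-level model of one tree level. The sketch below assumes 0-based inclusive index pairs for the intervals and reuses illustrative numbers; the function name and data are not from the paper.

```python
def reduce_to_intervals(p0, f0, r0, intervals):
    """Apply equations (3)-(5) to one level.

    intervals is a list of (x, y) pairs of 0-based, inclusive indices into
    the leaf-level arrays; together the pairs must cover all values.
    """
    pk, fk, rk = [], [], []
    for x, y in intervals:
        mass = sum(p0[x:y + 1])              # equation (3)
        pk.append(mass)
        fk.append(p0[y] * f0[y] / mass)      # equation (4): leave through the top value
        rk.append(p0[x] * r0[x] / mass)      # equation (5): leave through the bottom value
    return pk, fk, rk


# Hypothetical leaf-level model with four values, grouped into two intervals.
p0 = [1 / 15, 2 / 15, 4 / 15, 8 / 15]
f0 = [0.2, 0.2, 0.2, 0.0]
r0 = [0.0, 0.1, 0.1, 0.1]
pk, fk, rk = reduce_to_intervals(p0, f0, r0, [(0, 1), (2, 3)])
```

The reduced chain still satisfies detailed balance between neighbouring intervals, which is a convenient check on any implementation.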

Each update generates one message, while a query to

a particular node consists of a request and a

response. We will take this factor into consideration

in the following subsections.

For the purpose of this analysis we assume a given fixed reduction scheme that specifies the attribute value intervals at all levels, a given attribute value distribution specified by $p_0(i)$, $f_0(i)$ and $r_0(i)$, and a query activity model specified by $F$, the frequency of query requests within a given time interval, and $q_j$, the probability that a query of the form “find all leaf nodes that have the attribute value less than or equal to $a_j$” refers to the particular value $a_j$.

Update Cost

Given $p_k(i)$, $f_k(i)$ and $r_k(i)$, we can determine the update cost by determining the expected number of messages exchanged for an update. A given leaf node will cause an update to be generated from a node at level $k$ to a node at level $(k+1)$ if the attribute value crosses attribute value interval boundaries at level $(k+1)$. The probability of this event is denoted as $u_k$, where $0 \le k \le K-1$, and is equal to:

$$u_k = \sum_{i=1}^{l_{k+1}} p_{k+1}(i)\,\bigl[f_{k+1}(i) + r_{k+1}(i)\bigr] \tag{6}$$

The probability that a node at level $k$, $0 \le k \le K-1$, will actually generate an update is equal to the probability that at least one of the $n^k$ leaf nodes in the subtree rooted at this node causes an update; this probability, $U_k$, is:

$$U_k = 1 - (1 - u_k)^{n^k} \tag{7}$$

The expected number of updates, i.e. the update cost $CU$, is determined by adding these probabilities at all nodes:

$$CU = \sum_{k=0}^{K-1} \frac{N}{n^k}\, U_k = \sum_{k=0}^{K-1} \frac{N}{n^k} \left[1 - (1 - u_k)^{n^k}\right] \tag{8}$$
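Equations (7) and (8) can be evaluated with a few lines of code once the crossing probabilities $u_k$ are known. The following sketch takes the $u_k$ values as given inputs; the function name and the sample probabilities are hypothetical.

```python
def update_cost(n, K, u):
    """Equation (8): expected number of update messages per time interval.

    u[k] is the crossing probability u_k of equation (6), for k = 0 .. K-1.
    """
    N = n ** K
    total = 0.0
    for k in range(K):
        U_k = 1.0 - (1.0 - u[k]) ** (n ** k)   # equation (7)
        total += (N / n ** k) * U_k            # N / n^k nodes at level k
    return total


# Hypothetical crossing probabilities for a binary tree of height 3.
cu = update_cost(2, 3, [0.3, 0.1, 0.05])
```

Coarser intervals at the upper levels lower the corresponding $u_k$, which is why the update cost alone favours coarse partitions.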

Query Cost

The query cost is equal to twice the number of nodes that need to be visited to respond to the query. In the following sections, we consider a simple query of the form “find all leaf nodes that have the attribute value less than or equal to $a_j$”, where $a_j$ is one of the numerical values of the attribute. We consider two different variations for counting the visited nodes.

Case I: It is obvious that if the query is addressed to a node where $a_j$ is equal to the upper bound of one of the attribute value intervals $[a_x, a_y]$, i.e. $a_j = a_y$, the node can definitely respond to the query. Since the attribute value intervals are the same for all nodes at a given level, the simplest way of counting the nodes visited is to count all the nodes down to the level where all the nodes have an attribute value interval $[a_x, a_y]$. Let $k(a_j)$ denote this level; then the total number of nodes visited is equal to the total number of nodes from the root, level $K$, down to and including level $k(a_j)$. Using equation (2), this number is equal to:

$$m_{k(a_j)} = \frac{n^{K - k(a_j) + 1} - 1}{n - 1} \tag{9}$$

Consequently, the query cost for searching the leaf nodes satisfying $x \le a_j$ is:

$$cq(j) = 2\, m_{k(a_j)} = 2\, \frac{n^{K - k(a_j) + 1} - 1}{n - 1} \tag{10}$$

and the weighted total cost for all possible values of $a_j$ is

$$CQ = \frac{2F}{n - 1} \sum_{j=1}^{M} q_j \left(n^{K - k(a_j) + 1} - 1\right) \tag{11}$$

where $F$ is the frequency of query requests within a given time interval and $q_j$ is the probability that the query refers to a particular $a_j$.
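For Case I, equation (11) only needs the level $k(a_j)$ for each queried value. A minimal sketch follows, assuming the levels are supplied as a list; the function name and the sample data are hypothetical.

```python
def query_cost_case1(n, K, F, q, k_of):
    """Equation (11).

    q[j] is the probability that a query refers to value j, and k_of[j] is
    the level k(a_j) down to which such a query has to descend.
    """
    total = sum(qj * (n ** (K - kj + 1) - 1) for qj, kj in zip(q, k_of))
    return 2.0 * F * total / (n - 1)


# Hypothetical binary tree of height 3 with four equally likely query values.
cq = query_cost_case1(n=2, K=3, F=1.0, q=[0.25, 0.25, 0.25, 0.25], k_of=[1, 2, 1, 3])
```

A value that is an interval upper bound at the root (here the last one, with $k(a_j) = K$) costs a single request and response, while values resolved only near the leaves require visiting almost the whole tree.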

Case II: A more interesting and challenging case occurs when we consider closely the conditions under which we can prune the search space even before we reach level $k(a_j)$. Obviously, if we encounter a node $g$ at level $k(g) > k(a_j)$ such that the attribute value interval $[a_x, a_y]$ that contains $a_j$ has no resources, node $g$ can respond to the query without consulting its children. Let the index of the attribute value interval $[a_x, a_y]$ at this node be denoted by $i_{k(g)}(j)$. Note that this index is a function of only the level $k(g)$ of the node and not of the node itself because all the nodes at a given level have the same attribute intervals. The probability of this attribute value interval having no resources is equal to the probability that none of the $n^{k(g)}$ leaf nodes in the sub-tree rooted by $g$ have an attribute in this interval. In our notation this is equal to

$$\Bigl(1 - p_{k(g)}\bigl(i_{k(g)}(j)\bigr)\Bigr)^{n^{k(g)}} \tag{12}$$

where $p_k(i)$ is the probability that the attribute value of a leaf node falls in the $i$th interval of level $k$.

In this case, we want to determine the expected number of visited nodes given the $p_k(i)$ distributions at all levels. With the current notation, the probability that the children of $g$ are visited is:

$$1 - \Bigl(1 - p_{k(g)}\bigl(i_{k(g)}(j)\bigr)\Bigr)^{n^{k(g)}} \tag{13}$$

and consequently, the expected number of the children of $g$ that are visited is

$$n \left(1 - \Bigl(1 - p_{k(g)}\bigl(i_{k(g)}(j)\bigr)\Bigr)^{n^{k(g)}}\right) \tag{14}$$

Since there are $N / n^{k(g)}$ nodes at level $k(g)$ and there are $n$ children nodes for node $g$, and the probability distributions of these nodes are independent from each other, the expected number of visited nodes at level $(k(g) - 1)$ is:

$$\frac{N}{n^{k-1}} \left(1 - \bigl(1 - p_k(i_k(j))\bigr)^{n^k}\right) \tag{15}$$

Note that we have dropped the explicit dependence of $k(g)$ on $g$ since all nodes at level $k = k(g)$ are identical.

The query starts from the root. It is obvious that the root has to be visited with probability one, while for the lower levels the expected number of visited nodes is given by the equation derived earlier. Consequently the total expected number of visited nodes is:

$$1 + \sum_{k=k(a_j)+1}^{K} \frac{N}{n^{k-1}} \left(1 - \bigl(1 - p_k(i_k(j))\bigr)^{n^k}\right) \tag{16}$$

The corresponding expected cost for a given $a_j$ is:

$$cq(j) = 2 + 2 \sum_{k=k(a_j)+1}^{K} \frac{N}{n^{k-1}} \left(1 - \bigl(1 - p_k(i_k(j))\bigr)^{n^k}\right) \tag{17}$$

and

$$CQ = 2F + 2F \sum_{j=1}^{M} q_j \sum_{k=k(a_j)+1}^{K} \frac{N}{n^{k-1}} \left(1 - \bigl(1 - p_k(i_k(j))\bigr)^{n^k}\right) \tag{18}$$
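The double sum of equation (18) can be evaluated as written. In the sketch below, the table `p_cont` holds $p_k(i_k(j))$, the probability mass of the level-$k$ interval that contains $a_j$; this table layout, the function name and the sample data are assumptions of the example, not part of the paper.

```python
def query_cost_case2(n, K, F, q, k_of, p_cont):
    """Equation (18).

    p_cont[k][j] stands for p_k(i_k(j)): the probability that a leaf value
    falls in the level-k interval containing a_j (entry 0 of p_cont is unused).
    k_of[j] is the level k(a_j) of Case I.
    """
    N = n ** K
    total = 0.0
    for j, qj in enumerate(q):
        for k in range(k_of[j] + 1, K + 1):
            # Probability that a level-k node must consult its children.
            visited = 1.0 - (1.0 - p_cont[k][j]) ** (n ** k)
            total += qj * (N / n ** (k - 1)) * visited
    return 2.0 * F + 2.0 * F * total


# Hypothetical two-value attribute on a binary tree of height 2:
# the root keeps one interval, level 1 keeps singleton intervals.
cost = query_cost_case2(
    n=2, K=2, F=1.0, q=[0.5, 0.5], k_of=[1, 2],
    p_cont=[None, [0.5, 0.5], [1.0, 1.0]],
)
```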

Total Cost

Case I: From (8) and (11), we derive:

$$C = CU + CQ = \sum_{k=0}^{K-1} \frac{N}{n^k} \left[1 - (1 - u_k)^{n^k}\right] + \frac{2F}{n - 1} \sum_{j=1}^{M} q_j \left(n^{K - k(a_j) + 1} - 1\right) \tag{19}$$

Case II: From (8) and (18), we derive:

$$C = CU + CQ = \sum_{k=0}^{K-1} \frac{N}{n^k} \left(1 - (1 - u_k)^{n^k}\right) + 2F + 2F \sum_{j=1}^{M} q_j \sum_{k=k(a_j)+1}^{K} \frac{N}{n^{k-1}} \left(1 - \bigl(1 - p_k(i_k(j))\bigr)^{n^k}\right) \tag{20}$$

5 ANALYSIS OF THE OPTIMIZATION PROBLEM

As described earlier, the optimization problem

considered in this paper is to determine the attribute

value intervals at all levels of the hierarchy given the

attribute value distribution and the query

probabilities. First, let us consider the possible

attribute value intervals, called partitions hereafter.

The total number of possible partitions of an attribute with $M$ discrete values is:

$$\sum_{i=0}^{M-1} \frac{(M-1)!}{i!\,(M-1-i)!} = 2^{M-1}. \tag{21}$$
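The count in equation (21) follows from choosing, for each of the $M-1$ gaps between consecutive values, whether an interval boundary falls there. The sketch below enumerates the partitions in this way; the cut-position encoding and the function name are illustrative assumptions.

```python
from itertools import combinations


def interval_partitions(M):
    """Yield every partition of the values 1..M into consecutive intervals.

    A partition is identified by its cut positions c (1 <= c < M), meaning
    that a_c is the right boundary of an interval; a_M always is one.
    """
    for size in range(M):
        for cuts in combinations(range(1, M), size):
            bounds = list(cuts) + [M]
            starts = [1] + [c + 1 for c in cuts]
            yield list(zip(starts, bounds))


parts = list(interval_partitions(4))
```

For four values this yields eight partitions, from the single interval covering everything to the four singleton intervals.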

Let $\Pi$ denote the set of all partitions and let us define a “coarser or equal to” relationship between the partitions. A partition $P \in \Pi$ is coarser or equal to another partition $P' \in \Pi$, denoted $P \succeq P'$, if the intervals of $P$ are unions of one or more consecutive intervals of $P'$.

A particular feasible solution consists of a sequence of partitions $\{P_K, P_{K-1}, \ldots, P_1, P_0\}$ where $P_k$ represents the partition of level $k$, $P_0$ is always the partition consisting of singleton intervals, and $P_i \succeq P_j$ for all $i \ge j$.

The cost functions $C(\{P_K, P_{K-1}, \ldots, P_1, P_0\})$ that were calculated in the previous sections are for a particular feasible solution $\{P_K, P_{K-1}, \ldots, P_1, P_0\}$, and the optimization problem is to determine a solution $\{P_K^*, P_{K-1}^*, \ldots, P_1^*, P_0^*\}$ that minimizes $C$.

5.1 Dynamic Programming Formulation

In the following, we rewrite the cost equations to

highlight the dynamic programming (Ecker and

Kupferschmid, 1998) nature of the problem. We

will consider the results for Case II of the previous

section; however the same approach is also valid for

case I.

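The cost for a particular feasible solution $\{P_K, P_{K-1}, \ldots, P_1, P_0\}$ was derived in equation (20). The constant term $2F$ in that equation has no consequence on the solution. Let us consider the last term, i.e. $CQ$ without its constant part, and change the order of summations:

$$CQ - 2F = 2F \sum_{j=1}^{M} q_j \sum_{k=k(a_j)+1}^{K} \frac{N}{n^{k-1}} \left(1 - \bigl(1 - p_k(i_k(j))\bigr)^{n^k}\right) \tag{22}$$

$$CQ - 2F = 2F \sum_{k=1}^{K} \frac{N}{n^{k-1}} \sum_{j=1}^{M} q_j \times s\bigl(k - (k(a_j) + 1)\bigr) \times \left(1 - \bigl(1 - p_k(i_k(j))\bigr)^{n^k}\right) \tag{23}$$

where $s\bigl(k - (k(a_j) + 1)\bigr)$ is the step function, i.e.:

$$s\bigl(k - (k(a_j) + 1)\bigr) = \begin{cases} 1 & \text{if } k \ge k(a_j) + 1 \\ 0 & \text{otherwise} \end{cases} \tag{24}$$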

Let us define, for a given $k$, the set $J(k)$ of all indices $j$ such that $a_j$ does not appear as a right boundary at level $k$ or above. $J(k)$ is equal to:

$$J(k) = \{\, j \mid 1 \le j \le M \text{ and } k > k(a_j) \,\} \tag{25}$$

The variable part of $CQ$ can then be written as:

$$CQ - 2F = 2F \sum_{k=1}^{K} \frac{N}{n^{k-1}} \sum_{j \in J(k)} q_j \left(1 - \bigl(1 - p_k(i_k(j))\bigr)^{n^k}\right) \tag{26}$$

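With these transformations, and after re-indexing the update sum of equation (8) from $k$ to $k-1$, the cost function can be written as:

$$C = \sum_{k=1}^{K} \frac{N}{n^{k-1}} \left(1 - (1 - u_{k-1})^{n^{k-1}}\right) + 2F + 2F \sum_{k=1}^{K} \frac{N}{n^{k-1}} \sum_{j \in J(k)} q_j \left(1 - \bigl(1 - p_k(i_k(j))\bigr)^{n^k}\right) \tag{27}$$

Finally:

$$C = 2F + \sum_{k=1}^{K} \frac{N}{n^{k-1}} \times \left(1 - (1 - u_{k-1})^{n^{k-1}} + 2F \sum_{j \in J(k)} q_j \left(1 - \bigl(1 - p_k(i_k(j))\bigr)^{n^k}\right)\right) \tag{28}$$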

By definition, $J(k)$ can be uniquely determined knowing the partitions in the sub-sequence $\{P_K, P_{K-1}, \ldots, P_k\}$, while $i_k(j)$ and the other terms in the second part are uniquely determined by the partition $P_k$. Finally, since:

$$u_k = \sum_{i=1}^{l_{k+1}} p_{k+1}(i)\,\bigl[f_{k+1}(i) + r_{k+1}(i)\bigr] \tag{6}$$

$u_{k-1}$ is uniquely determined by the partition $P_k$.

These considerations lead us to define the partial cost function $C(\{P_K, P_{K-1}, \ldots, P_i\})$, for $1 \le i \le K$, as:

$$C(\{P_K, P_{K-1}, \ldots, P_{i+1}, P_i\}) = 2F + \sum_{k=i}^{K} \frac{N}{n^{k-1}} \times \left(1 - (1 - u_{k-1})^{n^{k-1}} + 2F \sum_{j \in J(k)} q_j \left(1 - \bigl(1 - p_k(i_k(j))\bigr)^{n^k}\right)\right) \tag{29}$$

or equivalently:

$$C(\{P_K, P_{K-1}, \ldots, P_{i+1}, P_i\}) = C(\{P_K, P_{K-1}, \ldots, P_{i+1}\}) + \frac{N}{n^{i-1}} \left(1 - (1 - u_{i-1})^{n^{i-1}} + 2F \sum_{j \in J(i)} q_j \left(1 - \bigl(1 - p_i(i_i(j))\bigr)^{n^i}\right)\right) \tag{30}$$

with:

$$C(\{P_K\}) = 2F + \frac{N}{n^{K-1}} \times \left(1 - (1 - u_{K-1})^{n^{K-1}} + 2F \sum_{j \in J(K)} q_j \left(1 - \bigl(1 - p_K(i_K(j))\bigr)^{n^K}\right)\right) \tag{31}$$

where the second term of (30) is a function of $\{P_K, P_{K-1}, \ldots, P_{i+1}, P_i\}$ and of none of the partitions $P_k$ for $k < i$.

The following algorithm formalizes the derivation in the previous subsections.

1. For every $P \in \Pi$, calculate $C(\{P\})$.
2. For $k = K-1$ to $1$ do:
   2.a For every $P \in \Pi$, identify all partial solutions $\{P_K, P_{K-1}, \ldots, P_{k+1}\}$ such that $P_{k+1} \succeq P$.
   2.b For every partial solution in step 2.a do:
       2.b.i Calculate $C(\{P_K, P_{K-1}, \ldots, P_{k+1}, P\})$.
       2.b.ii Determine the $\{P_K^*, P_{K-1}^*, \ldots, P_{k+1}^*\}$ that minimizes $C(\{P_K, P_{K-1}, \ldots, P_{k+1}, P\})$.
       2.b.iii Add $\{P_K^*, P_{K-1}^*, \ldots, P_{k+1}^*, P\}$ to the set of potential solutions of level $k$.
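The steps above can be sketched in code for small $M$. This is an illustrative sketch under stated assumptions, not the authors' implementation: partitions are encoded as sets of cut positions (a hypothetical encoding), all partitions are enumerated exhaustively, and the sketch relies on the observation that, because the partitions are nested, the right boundaries present at level $k$ or above are exactly those of $P_k$, so the level-$k$ term of equation (28) can be computed from $P_k$ alone. All function names and sample numbers are invented for the example.

```python
from itertools import combinations


def all_cut_sets(M):
    """Every partition of a_1..a_M into consecutive intervals, encoded as the
    frozenset of cut positions c (a_c is a right boundary, 1 <= c < M)."""
    return [frozenset(c) for s in range(M) for c in combinations(range(1, M), s)]


def stage_cost(k, cuts, n, N, F, p0, f0, r0, q):
    """Level-k term of equation (28) for the partition P_k given by cuts."""
    M = len(p0)
    bounds = sorted(cuts) + [M]
    starts = [1] + [c + 1 for c in sorted(cuts)]
    u = 0.0        # u_{k-1}: equation (6) after substituting (4) and (5)
    query = 0.0    # sum over j in J(k)
    for x, y in zip(starts, bounds):
        u += p0[y - 1] * f0[y - 1] + p0[x - 1] * r0[x - 1]
        mass = sum(p0[x - 1:y])                      # p_k(i), equation (3)
        visit = 1.0 - (1.0 - mass) ** (n ** k)
        # Values x..y-1 are not right boundaries of P_k, hence belong to J(k).
        query += sum(q[j - 1] for j in range(x, y)) * visit
    update = 1.0 - (1.0 - u) ** (n ** (k - 1))
    return (N / n ** (k - 1)) * (update + 2.0 * F * query)


def optimal_partitions(n, K, F, p0, f0, r0, q):
    """Return (minimum cost, [P_K, ..., P_1]) by the recursion of equation (30)."""
    N, parts = n ** K, all_cut_sets(len(p0))
    args = (n, N, F, p0, f0, r0, q)
    # Step 1: cost of every root partition (equation (31) without the constant 2F).
    best = {P: (stage_cost(K, P, *args), [P]) for P in parts}
    # Step 2: extend the cheapest compatible partial solution one level down.
    for k in range(K - 1, 0, -1):
        step = {}
        for P in parts:
            # Q <= P (subset of cuts) means Q is coarser than or equal to P.
            cost, chain = min((best[Q] for Q in parts if Q <= P), key=lambda t: t[0])
            step[P] = (cost + stage_cost(k, P, *args), chain + [P])
        best = step
    cost, chain = min(best.values(), key=lambda t: t[0])
    return cost + 2.0 * F, chain


# Hypothetical inputs: four attribute values, binary tree of height 3.
p0 = [1 / 15, 2 / 15, 4 / 15, 8 / 15]
f0 = [0.2, 0.2, 0.2, 0.0]
r0 = [0.0, 0.1, 0.1, 0.1]
q = [0.25, 0.25, 0.25, 0.25]
cost, chain = optimal_partitions(n=2, K=3, F=0.5, p0=p0, f0=f0, r0=r0, q=q)
```

The returned `chain` lists the cut sets from the root down to level 1; $P_0$ is the fixed singleton partition and is therefore not part of the search. Since the number of partitions is $2^{M-1}$, this exhaustive sketch is only practical for small $M$.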

6 FUTURE WORK

In summary, we have introduced a new architecture

for a scalable grid information service and modeled

it in order to obtain the optimal parameters of the

system in the particular case of a single attribute

with known attribute model and query distribution

model. We are working on extending this approach

to multiple resource attributes. By assuming that

multiple attributes share the same index topology for

information update and query, and that the update

and query messages and their responses for all

attributes are merged together, we can calculate the

cost given the aggregation intervals for each

attribute. The aggregation intervals themselves can

be found by using the method given in Section 5 as a

first order of approximation.

Another avenue for future research is the overall cost vs. accuracy comparison with the reactive information gathering scheme. We believe that there

is a threshold beyond which the proactive

information gathering scheme is better than the

reactive information gathering scheme. This

threshold depends on 1) how quickly the attribute

changes; 2) how often queries occur and 3) the time

to live values of the reactive caching.

REFERENCES

Kaufmann, M., 2004. The Grid: Blueprint for a new computing infrastructure, 2nd ed., ISBN: 1-55860-933-4.

Foster, I.; Kesselman, C.; and Tuecke, S., 2001, “The

anatomy of the grid: Enabling scalable virtual

organizations,” International J. Supercomputer

Applications, 15(3).

Foster, I; Kesselman, C.; Nick, J; Tuecke, S. June 22,

2002, “The physiology of the grid: An open grid

services architecture for distributed systems

integration,” Open Grid Service Infrastructure WG,

Global Grid Forum,

http://www.globus.org/alliance/publications/papers/og

sa.pdf.

Czajkowski, K.; Fitzgerald, S.; Foster, I.; and Kesselman,

C., August 2001,“Grid information services for

distributed resource sharing,” Tenth IEEE

International Symposium on High-Performance

Distributed Computing, IEEE Press.

Zhang, X.; Schopf, J., April 2004. “Performance analysis

of the Globus Toolkit monitoring and discovery

service, MDS2,” Proc. International Workshop on

Middleware Performance (MP 2004), 23rd

International Performance Computing and

Communications Workshop (IPCCC).

Mastroianni, C.; Talia, D.; and Verta, O., 2005. “A

superpeer model for building resource discovery

services in grids: Design and simulation analysis,”

Advances in Grid Computing - EGC : European Grid

Conference, Amsterdam, The Netherlands, P.M.A.

Sloot et al. Eds., Springer-Verlag.

Balaton, Z.; Gombás, G.; and Németh, Zs., 2002.

“Information system Architecture for brokering in

large scale grids,” Parallel and Distributed Systems:

Cluster and Grid Computing (Proceedings of

DAPSYS 2002, Linz), Kluwer, pp. 57-65.

Jie, C., February 2004. “Index grid services using Globus

Toolkit 3.0,” IBM developerWorks, http://www-

128.ibm.com/developerworks/grid/library/gr-

indexgrid/.

Cooke A. W. et al., December 2004. “The relational grid

monitoring architecture: Mediating information about

the grid”, Journal of Grid Computing, Vol. 2 No. 4.

Massie, M.; Chun, B; and Culler, D., 2004. “The Ganglia

distributed monitoring system: Design,

implementation, and experience”, Parallel Computing

Vol. 30, pp.817–840.

Leon-Garcia, A., 1994. “Probability and random processes for electrical engineering”, 2nd ed., Addison-Wesley Publishing Company, pp. 459-498.

Ecker J. and Kupferschmid, M., 1998. “Introduction to

operations research”, Krieger Publishing Company,

1988, pp. 347-374.
