DYNAMIC RESOURCE ALLOCATION AND ACTIVE PREDICTIVE

MODELS FOR ENTERPRISE APPLICATIONS

M. Al-Ghamdi, A. P. Chester, L. He and S. A. Jarvis

Performance Computing and Visualisation, Department of Computer Science, University of Warwick, Coventry, U.K.

Keywords:

Dynamic resource allocation, Enterprise applications, Switching policies, Workload prediction.

Abstract:

This work is concerned with dynamic resource allocation for multi-tiered, cluster-based web hosting environ-

ments. Dynamic resource allocation is reactive, that is, when overloading occurs in one resource pool, servers

are moved from another (quieter) pool to meet this demand. Switching servers comes with some overhead, so

it is important to weigh up the costs of the switch against possible system gains. In this paper we combine

the reactive behaviour of two well known switching policies – the Proportional Switching Policy (PSP) and

the Bottleneck Aware Switching Policy (BSP) – with the proactive properties of several workload forecasting

models. Seven forecasting models are used, including Last Observation, Simple Algorithm, Sample Moving

Average, Exponential Moving Algorithm, Low Pass Filter and Autoregressive Moving Average. As each of

the forecasting schemes has its own bias, we also develop three meta-forecasting algorithms (the Active Win-

dow Model, the Voting Model and the Selective Model) to ensure consistent and improved results. We show

that request servicing capability can be improved by as much as 40% when the right combination of dynamic

server switching and workload forecasting are used. As important is that we can generate consistently im-

proved results, even when we apply this scheme to real-world, highly-variable workload traces from several

sources.

1 INTRODUCTION

e-Business applications for on-line banking or on-line

retail are typically hosted on Internet hosting plat-

forms. These applications usually employ a multi-

tiered architecture, which provides a clear separation

of roles and allows each tier to be modiﬁed or re-

placed independently when needed. Commonly, a

multi-tier architecture consists of three tiers: a client-

facing web tier, where the request is received from

the client and from where the response is returned;

an application tier used for the application logic, and

a back-end data-persistence tier that is usually com-

prised of a relational database management system

(RDBMS).

As the use of enterprise applications becomes

more widespread, so issues concerning infrastructure

performance and dependability become more signif-

icant. Clusters of servers, consisting of multiple ho-

mogeneous or heterogeneous computers, have proven

to be a promising and cost-effective approach to meet-

ing these rapidly expanding needs (Yang and Luo,

2000). Thus, the enterprise applications described in

this work are distributed on high-availability, high-

performance clusters of servers.

In this paper, a typical enterprise system is mod-

elled using a multi-class closed queuing network to

compute the various performance metrics (such a rep-

resentation is common, as there is a limit to the

number of simultaneous customers logged into the

system (Menasc

e and Almeida, 2000)). Using this

analytical model it is possible to compute perfor-

mance metrics, identify potential bottlenecks and, im-

portantly, investigate a wide variety of hypothetical

scenarios, without running the actual system. One

should thus envisage such a model running alongside

a real system, where the model can react to param-

eter changes as the application is running (e.g. from

monitoring tools or system logs) and making dynamic

conﬁguration decisions to optimise pre-deﬁned per-

formance metrics (Xue et al., 2008); in this paper

this re-conﬁguration is captured in the switching of

servers from one resource pool (servicing one appli-

cation) to another pool) serving a different applica-

tion.

Most e-Business applications are subject to enor-

mous variations in workload demand (Chester et al.,

2008). System overloading can cause exception-

ally long response times for requests or even er-

rors, caused by the timing out of client requests and

551

Al Ghamdi M., P. Chester A., He L. and A. Jarvis S..

DYNAMIC RESOURCE ALLOCATION AND ACTIVE PREDICTIVE MODELS FOR ENTERPRISE APPLICATIONS.

DOI: 10.5220/0003388705510562

In Proceedings of the 1st International Conference on Cloud Computing and Services Science (CLOSER-2011), pages 551-562

ISBN: 978-989-8425-52-2

 2011 SCITEPRESS (Science and Technology Publications, Lda.)

connections being dropped by the overloaded appli-

cation. At the same time, the throughput of the

system would decrease signiﬁcantly (Cuomo, 2000).

A classic example of this was when the normally

well-provisioned Amazon.com site suffered a forty-

minute downtime due to an overload during the pop-

ular holiday season in November 2000 (Urgaonkar

et al., 2005). Another example is the failure of the

CNN.com website after the terrorist attacks on the

United States on September 11, 2001 (Faraz and Vi-

jaykumar, 2010).

As a result of these (and other) case studies, and

subsequent research, admission control is applied in

order to deal with system overloading. This scheme

is based on assigning priorities to requests and ensur-

ing that less important requests are rejected when the

system is overloaded (Xue et al., 2008).

It has been shown that assigning a ﬁxed number

of servers to a resource pool is clearly sub-optimal:

in many cases resources lay unused and during peak

demand there are insufﬁcient resources to service all

requests (Chester et al., 2008). Dynamic resource al-

location systems on the other hand have been shown

to improve revenue in such environments by reallo-

cating servers (between resource pools) into a more

beneﬁcial conﬁguration.

Workload forecasting approaches can be divided

into two different categories: quantitative and quali-

tative (Menasc

e and Almeida, 2001). The qualitative

approach is a subjective process based on different in-

formation such as expert opinion, historical analogy,

and commercial knowledge. The estimation of future

values based on the existence of historical data, i.e.

that seen in the quantitative approach, is the approach

to forecasting used in this work.

1.1 Paper Contribution and Structure

The contributions of this paper are as follows:

• Through modelling and supporting simulation, we

investigate the combination of two well-known

reactive server switching policies - the propor-

tional switching policy (PSP) and the bottleneck

aware switching policy (BSP) - coupled with the

proactive properties of several workload forecast-

ing models.

• Seven forecasting models are used, including Last

Observation, Simple Algorithm, Sample Moving

Average, Exponential Moving Algorithm, Low

Pass Filter and Autoregressive Moving Average.

As each of the forecasting schemes has its own

bias, we also develop three meta-forecasting al-

gorithms (the Active Window Model, the Voting

Model and the Selective Model) to ensure consis-

tent and improved results.

• We show that request servicing capability can be

improved by as much as 40% when the right

combination of dynamic server switching and

workload forecasting are used. We base our re-

sults on real-world workload traces from several

sources, including from the San Diego Supercom-

puter Centre, from the ClarkNet Internet access

provider for the Metro Baltimore-Washington DC

area and, from the NASA Kennedy Space Center

web-server in Florida.

The remainder of this paper is organised as fol-

lows: Section 2 presents related literature and con-

trasts this with our own work; the modelling of multi-

tiered Internet services and their performance, includ-

ing associated revenue functions, are described in sec-

tion 3; in section 4 we present bottleneck and admis-

sion control systems to enhance the overall system’s

revenue; the dynamic resource allocation policies ap-

plied to our system are found in section 5; in section 6

the workload used in our experimentation, and the as-

sociated predictive algorithms, are introduced; the ex-

perimental setup and results can be found in sections

7 and 8 respectively; the paper concludes in section 9.

2 RELATED WORK

An example of using queueing networks for multi-tier

Internet service modelling can be found in (Zalewski

and Ratkowski, 2006). This work discusses how it is

possible to predict the performance of a multi-tier sys-

tem with a satisfactory accuracy, which itself is im-

portant in the design of most e-business applications.

The queuing network in (Zalewski and Ratkowski,

2006) is solved using the Mean Value Analysis

(MVA) algorithm developed by (Reiser and Laven-

berg, 1980). There are however alternative algorithms

used for analysing queuing networks, including the

Approximate Mean Value Analysis (AMVA) algo-

rithm, where the accuracy and time to solve the model

can be traded (e.g. (Hsieh and Lam, 1988)).

(Liu et al., 2001) focusses on maximising proﬁt of

best-effort requests when combined with requests re-

quiring a speciﬁc quality of service (QoS) in a web

farm; in his work it is assumed that arrival rates of re-

quests are static; our work considers dynamic arrival

rates based on real-world data.

The work in (Cherkasova and Phaal, 2002) in-

troduced an approach called Session-based Admis-

sion Control (SBAC) in order to prevent a web server

from becoming overloaded and to ensure that longer

CLOSER 2011 - International Conference on Cloud Computing and Services Science

552

sessions can be completed in commercial web sites.

We also apply a simple admission control policy in

this research. The policy used here is quite differ-

ent from that found in (Federgruen and Groenevelt,

1986), where an algorithmic approach is used to opti-

mise a resource allocation problem where resources

are given in discrete units; it differs too from the

graph-theoretic approach for solving a resource al-

location optimisation problem, (see (Tantawi et al.,

1988)).

A number of researchers have studied bottleneck

identiﬁcation (e.g. (Litoiu, 2007)) for multi-class

closed product-form queueing networks where there

is no limit to growth. We employ the work in (Casale

and Serazzi, 2004) and (Xue et al., 2008) in collabo-

ration with HP Labs, IBM and the National Business

to Business Centre, where convex polytopes are used

for bottleneck identiﬁcation.

The proportional- and bottleneck-aware- switch-

ing policies are subject to analysis in our previous

work (Xue et al., 2008). The research presented here

is different from that found in (Xue et al., 2008) as (i)

it employs seven model-based workload prediction al-

gorithms and extends the infrastructure to process and

respond to this data, (ii) three meta-forecasting algo-

rithms are introduced, based on the observation that

one predictor alone is potentially sub-optimal, (iii) the

test data (the real-world Internet traces) are consid-

erably larger that those previously explored and are

taken from multiple sources.

The workload itself can be characterised at four

different levels: the business layer, the session

layer, the function layer, and the HTTP-request layer

(Menasc

e, 2003). Here the real-world Internet work-

load is characterised at the second of these levels,

where the set of requests issued from different users

are clustered periodically (see (Arlitt and Williamson,

1996)). Two well-known methods (server-side and

client-side) are used for data collection in web ana-

lytics (Mahanti et al., 2009). In this work we collect

data directly from the web server log ﬁles, where all

transactions and requests to the web site are stored

and undergo systematic analysis.

The work in (Gilly et al., 2004) applies several

predictive techniques to adaptive, web-cluster switch-

ing algorithms. There are two main differences be-

tween their work and ours. First, the system model

is itself quite different; we model the system as two

multi-tiered applications running on two pools, where

servers are moved from another (quieter) pool to deal

with overloading. (Gilly et al., 2004) on the other

hand use a model that consists of a set of servers with

a switch that allocates the incoming request to one of

the servers in the web cluster within a 2-tiered archi-

tecture. Secondly, the system monitoring processes

are also different; in our research we use ﬁxed-time

intervals (see the Active Window Model later in the

paper); (Gilly et al., 2004) monitor their system using

non-ﬁxed intervals (Adaptive Time Slot Scheduling)

based on the system’s request arrival rate.

Various regression methods, nonlinear methods,

moving average, and exponential smoothing algo-

rithms have been used for workload prediction in Web

services environments, (see (Menasc

e and Almeida,

2001)). (Keung et al., 2003) used several of the pre-

dictors found here (the Last Observation, Sample Av-

erage, Low Pass Filter, and the ARIMA model) to

predict the behaviour of data-exchange in the Globus

Grid middleware MDS. Running Average, Single

Last Observation, and Low Pass Filter were applied

in (Dushay et al., 1999) for optimising the choice

of indexers made by query mediators (QMs) (reduc-

ing the QM wait time and thus the user wait time

for search results from a distributed digital library).

This said, none of these predictive methods have

previously been employed alongside reactive server

switching policies; thus this work demonstrates a

number of new results in this area.

3 MODELLING OF

MULTI-TIERED INTERNET

SERVICES

A description of the system model which has been

used in this work, together with the associated rev-

enue function, are presented. The notation used in

this paper is summarised in table 1.

3.1 The System Model

We use a multi-class closed queueing network to rep-

resent our system of interest, where C, WS, AS, and

DS represent the Client, Web Server, Application

Server, and Database Server respectively. The appli-

cation is modelled using both -/M/1 ﬁrst-come-ﬁrst-

served and -/M/m- ﬁrst-come-ﬁrst-served in each sta-

tion, and it is assumed that servers are clustered at

each tier.

Mean value analysis (MVA) (Reiser and Laven-

berg, 1980), based on Little’s law (Little, 1961) and

the Arrival Theorem (Rolia et al., 2004), is applied

to solve the closed queueing network model. The

simplicity of MVA (in contrast with convolution al-

gorithms (Cavendish et al., 2010)) and the accuracy

of the results (Xue et al., 2008) are the main reasons

for using MVA in this work.

DYNAMIC RESOURCE ALLOCATION AND ACTIVE PREDICTIVE MODELS FOR ENTERPRISE APPLICATIONS

553

Figure 1: A model of a typical conﬁguration of a cluster-

based multi-tiered Internet service.

Table 1: Notation used in this paper.

Symbol Description

Service time of job class-r at station i

Visiting ratio of job class-r at station i

N Number of service stations in QN

K Number of jobs in QN

R Number of job classes in QN

Number of class-r jobs at station i

Number of servers at station i

Revenue of each class-r job

Marginal probability at centre i

T System response time

Deadline for class-r jobs

Exit time for class-r jobs

Probability that class-r job remains

Class-r throughput before switching

Class-r throughput after switching

Utilisation at station i

Server switching time

Switching decision interval

In a multi-class closed queuing network S

repre-

sents the service time, which is deﬁned as the average

time spent by a class-r job during a single visit to sta-

tion i; v

represents the visiting ratio of class-r jobs

at station i.

(Litoiu, 2007) deﬁnes the service demand D

the sum of the service times at a resource over all vis-

its to that resource during the execution of a transac-

tion or request, thus D

= S

• v

. The total popula-

tion of the network (K) is the total population at each

class-r.

The reminder of the performance parameters used

in this work include: the mean system response time

(k), the throughput of class-r jobs X

; the mean

queue length K

, and the utilisation per-class station

(k). These terms are all described in detail in (Xue

et al., 2008).

3.2 Revenue Functions

An Internet hosting centre supports many multi-tier

applications for its clients, each of which will have

separate service level agreements (SLAs). The SLA

deﬁnes the level of service agreed between the client

and the hosting centre and may include performance

and availability targets, with penalties to be paid if

such targets are not met. It is in the interests of the

service hosting centre to ensure that its SLAs are met

so that it can maximise its revenue, whilst ensuring

that its resources are well utilised.

The maximum revenue P(T

) is obtained when

the client’s request is met within its deadline D

While revenue obtained from requests which are not

served within the deadline decreases linearly to zero,

at which point the request exits the system E

using

the equation; P(T

) = (E

- T

) / (E

- D

); where T

represents the request’s response time.

With respect to the probability of the request exe-

cution, the lost V

loss

and gained V

gain

revenue are cal-

culated using the equations 1 and 2 respectively, with

the assumption that the servers are switched from pool

i to pool j.

Note that because the servers are being switched,

they can not be used by both pools i and j during

the switching process and the time that the migra-

tion takes cannot be neglected. The revenue gain from

the switching process is calculated during the switch-

ing decision interval time t

as shown in equation 2,

where the switching decision interval is greater than

the switching time.

loss

∑

r=1

)φ

P(T

−

∑

r=1

)φ

P(T

(1)

gain

∑

r=1

)φ

P(T

)(t

−t

) −

∑

r=1

)φ

P(T

)(t

−t

)

(2)

After calculating the lost and gained revenue using

equations 1 and 2, servers may be switched between

the pools. In this paper servers are only switched be-

tween the same tiers, and only when the revenue gain

is greater than the revenue lost.

4 BOTTLENECK AND

ADMISSION CONTROL

It is known that the resources that limit the overall

performance of the system are the congested ones,

referred to as the bottlenecks (Casale and Serazzi,

2004). In (Xue et al., 2008) we show that the bot-

tleneck may occur at any tier and may shift between

tiers; we also demonstrate that the system may enter a

state where more than one tier becomes a bottleneck.

The work uses the convex polytopes approach where

the set of potential bottlenecks in a network with one

CLOSER 2011 - International Conference on Cloud Computing and Services Science

554

thousand servers, two different server pools and ﬁfty

customer classes, can be computed in just a few sec-

onds.

Figure 2 shows example bottleneck identiﬁcation

results using convex polytopes for a chosen server-

pool conﬁguration. We see that when the percentage

of gold class jobs is less than 46.2%, the web server

tier is the bottleneck; when it is between 46.2% and

61.5%, the system enters a crossover region, where

the bottleneck changes; when the percentage of gold

class jobs in pool 1 exceeds 61.5%, the application

server tier becomes the bottleneck. Thus it is clear

that bottleneck identiﬁcation should be one of the ﬁrst

steps in any performance study; any system upgrade

which does not remove the bottleneck(s) will have

no impact on the system performance at high loads,

see (Marzolla and Mirandola, 2007).

Admission control offers a possible solution to

the overloading problem which can cause a signif-

icant increase in the response time of requests. A

simple admission control policy has been developed

(see (Xue et al., 2008)) and has been applied in this

research. This policy works by dropping less valuable

requests when the response time exceeds a threshold,

and therefore maintaining the number of concurrent

jobs in the system at an appropriate level.

5 SERVER SWITCHING

POLICIES

In a statically allocated system comprising many

static server pools, a high workload may exceed the

capacity of one of the pools causing a loss in rev-

enue, while lightly loaded pools may be considered

as wasted resources if their utilisation is low. In other

words, when the workload level is high, allocating a

ﬁxed number of servers is insufﬁcient for one applica-

tion, whereas it is a wasted resource for the remaining

applications while the workload is light. Dynamic re-

source allocation has been shown to provide a signif-

icant increase in total revenue through the switching

of available resources in accordance with the changes

in each of the application’s workloads. The policies

which we utilise here are the Proportional Switching

Policy (PSP) and the Bottleneck-aware Switching Pol-

icy (BSP).

5.1 Proportional Switching Policy

The proportional switching policy was ﬁrst presented

in (Xue et al., 2008) and then subsequently analysed

in (Al-Ghamdi et al., 2010). This policy works by al-

Figure 2: Bottleneck identiﬁcation using complex poly-

topes.

locating servers at each tier in proportion to the work-

load and subject to an improvement in revenue.

5.2 Bottleneck-aware Switching Policy

There are some factors that may affect the system’s

performance (e.g. workload mix and revenue con-

tribution from individual classes of job in different

pools). The second algorithm which is used in this

work is the bottleneck-aware switching policy, which

overcomes these factors in order to obtain improved

performance results. This is a best-effort algorithm

(Xue et al., 2008).

Both policies operate by re-calculating the dis-

tribution of servers between pools at ﬁxed intervals.

This re-calculation is based on changes in the system

which have occurred in the last time period, thus the

schemes are reactive in that the respond to changes

in the system. The time intervals at which this re-

calculation is done can be varied; however, there is a

cost associated with the switching of servers between

pools (for a period these servers are off-line, persis-

tent data needs to be swapped in and out etc.) and a

balance must be struck to avoid thrashing.

6 THE WORKLOAD AND

PREDICTIVE MODELS

The observation of past values in order to anticipate

future behaviour represents the essence of the fore-

casting process as seen in this paper. Numerous pre-

dictors are discussed and the way in which they are

applied in the context of dynamic resource allocation

is analysed. Our premise is that workload forecast-

ing may assist revenue-generating enterprise systems

which already employ methods of dynamic resource

allocation; however, as with forecasting in other do-

mains, the predictions may in fact be wrong, and this

DYNAMIC RESOURCE ALLOCATION AND ACTIVE PREDICTIVE MODELS FOR ENTERPRISE APPLICATIONS

555

100

200

300

400

500

600

700

800

0 15 30 45 60 75

No. of Requests

Time Periods (mins)

Pool1

Pool2

Figure 3: A sample of the total requests for both application

pools.

may result in server reallocation to the detriment of

the service.

As in previous capacity planning work (Menasc

and Almeida, 2001), we generate a workload model

from the characterisation of real data. The predictive

forecasting that has been used in this work is based on

past values, using several different predictors: Last

Observation (LO), Simple Algorithm (SA), Sample

Moving Average (SMA), Exponential Moving Av-

erage (EMA), Low Pass Filter (LPF), and an Au-

toregressive Integrated Moving Average (ARIMA).

These forecasting algorithms are also combined to-

gether in several different ways in order to gener-

ate meta-forecasting models – Active Window Model

(AWM), Voting Model (VM), and Selective Model

(SM) – which are then combined with the the switch-

ing policies – the proportional switching policy (PSP)

and the bottleneck aware switching policy (BSP). The

forecasting and server switching therefore work in

tandem.

6.1 The Workload

The workload is deﬁned as the set of all inputs the sys-

tem receives from its environment during any given

period of time (Menasc

e and Almeida, 2001). In this

study the workloads (e.g. see ﬁgure 3) are based

on Internet traces containing two days, two weeks,

and two months worth of HTTP requests. The ﬁrst

workload is generated from two real-world Internet

traces containing 76,086 requests in total and each of

which contains a days worth of HTTP requests to the

EPA WWW server located at Research Triangle Park,

NC and the SDSC WWW server located at the San

Diego Supercomputer Center in California respec-

tively (LBNL, 2008). The second workload has been

collected from ClarkNet WWW server which is a

full Internet access provider for the Metro Baltimore-

Washington DC area (Arlitt and Williamson, 1996).

This workload contains 3,328,587 requests issued to

the server during the period of two weeks. The

third workload used in this research is obtained

from the NASA Kennedy Space Center web-server

in Florida (Arlitt and Williamson, 1996). This trace

contain 3,461,612 requests spanning two months.

In typical fashion (see also (Menasc

e and

Almeida, 2001)) we characterise this workload to

form a workload model, which can then be used as

the input to our system model.

6.2 Predictive Models

The predictive algorithms – Last Observation (LO),

Simple Algorithm (SA), Sample Moving Average

(SMA), Exponential Moving Algorithm (EMA), Low

Pass Filter (LPF), Autoregressive Integrated Mov-

ing Average Model (ARIMA) – are documented

elsewhere (Al-Ghamdi et al., 2010). The meta-

forecasting models, Active Window Model (AWM),

Voting Model (VM), and Selective Model (SM) are

described below:

6.2.1 Active Window Model (AWM)

Figure 4 demonstrates the change in revenue that re-

sults from applying each of the seven forecasting pre-

dictors to the NASA workload under the PSP switch-

ing policy. It is clear that while there are similar trends

over time, some predictors produce better results than

others. It is also the case that the results are not con-

sistent, that is, one predictor does not consistently per-

form better than all the others.

Revenue

Increasing Time Periods

SMA

EMA

LPF

AR1

AR2

Figure 4: Revenue samples from applying the seven predic-

tors (NASA workload, PSP switching policy).

In the AWM model, the data points for all pre-

dictors are collected during a ﬁxed period (the Active

CLOSER 2011 - International Conference on Cloud Computing and Services Science

556

Window). The gained revenue from each predictor

is compared with the original revenue with no fore-

casting. The best predictor, i.e. that which results

in the highest revenue for the last period (along with

the switching policy), is then used for the next period.

In this model the active windows have varying dura-

tion: 5m, 10m, 15m, 20m, 25m, 30m, 1h, 2h, 12h,

and 24h; where m and h represent minutes and hours

respectively.

6.2.2 Voting Model (VM)

The voting model (VM) is based on the following sce-

nario (for both switching policies): First, each of the

different predictors are applied to the system, these

predictions are acted upon and the system is reconﬁg-

ured accordingly (resulting in one system conﬁgura-

tion for each of the predictors/server-switching poli-

cies); the system (re-)conﬁguration chosen most often

(i.e. with the most votes) is then applied to the system

proper. This system clearly requires more calculation

within the model, as we are deciding on the ﬁnal state

of the system as opposed to simply an up-front pre-

diction of workload.

6.2.3 Selective Model (SM)

The selective model works by choosing those predic-

tors that have performed best during the past time pe-

riod, and employing these for the next time period.

SM(B2) calculates the mean of the best two predictors

(as compared to the original system without predic-

tors). Several other selective models are also applied,

including the selective model with the best three or

four predictors (SM(B3)) and (SM(B4)), and the se-

lective model with an average of all the predictors

computed and then applied (SM(AVG)).

In each case, the choice of workload prediction

technique is dynamic; that is, no one prediction tech-

nique is applied throughout the system lifetime. This

aim of such an approach is to avoid bias and to en-

sure that the variability in the workload (which we

inevitably see between the variety of input sources) is

somehow accounted for. Figure 4 highlights the need

for such a scheme; the Low Pass Filter predictor, for

example, can produce the second-best revenue in one

time period, to be followed by the second-worst rev-

enue in the subsequent time period. Workload clearly

impacts on the effectiveness of the prediction and dy-

namic server reallocation combined.

7 EXPERIMENTAL SETUP

We have developed a supporting simulator to allow

us to verify the behaviour of our theoretical models.

We prime the simulator with measured values from an

in-house test platform, or use values from supporting

literature where these are not attainable. We simulate

two multi-tiered applications running on two logical

pools (1 and 2) on a cluster of servers. There are two

different classes of job (gold and silver), which rep-

resent the importance of these jobs. The service time

and the visiting ratio v

are both based on realistic

(i.e. sampled) values.

In this work, three different models have been de-

veloped to compute the request servicing capability:

1) the Active Window Model (AWM); 2) the Vot-

ing Model (VM) and; 3) the Selective Model (SM).

In each case, the results show the base-line revenue

when no switching policy is applied (NSP) and also

the case when the switching policy alone (without

forecasting) is applied (NP). These provide good indi-

cators against which the new results can be compared.

8 RESULTS

The results from applying our new predictive models

along with the dynamic server switching policies on

three real-world Internet traces are shown in the tables

2, 3, and 4 and ﬁgures 5, 6, and 7, and are described

in the following sections.

8.1 Experiment One

The ﬁrst experiment is conducted using the ﬁrst work-

load which has been generated from two real-world

Internet traces to the EPA WWW server located at

Research Triangle Park, NC and the SDSC WWW

server located at the San Diego Supercomputer Cen-

ter, California respectively (LBNL, 2008). The re-

sults from the different predictive models based on

this workload are shown in table 2. This table rep-

resents the gained revenue from applying the seven

predictive algorithms – LO, SA, SMA, EMA, LPF,

AR1, and AR2. In this case one or other of the pre-

dictors are applied consistently throughout the exper-

iment. Figure 5 represents the revenue that has been

achieved using the three meta-forecasting models –

AWM, VM, and SM; this therefore represents a com-

bination of applied predictors, depending on the de-

tails of the scheme.

The two server switching policies used in this

work (PSP and BSP) provide 5.63% and 103.47%

improvement in system revenue compared to the non

switching policy (NSP). The results from table 2 show

a 12.15% improvement in the system revenue when

the AR1 algorithm is applied with PSP, compared to

DYNAMIC RESOURCE ALLOCATION AND ACTIVE PREDICTIVE MODELS FOR ENTERPRISE APPLICATIONS

557

Table 2: Revenue gains for switching policy and forecasting combinations under the ﬁrst workload.

PSP + Predictive Algorithm

Policy NSP PSP LO SA SMA EMA LPF AR(1) AR(2)

Total Revenue 614.1 648.7 650 647.9 645.7 677.8 651.3 727.5 660.6

Improvement over PSP (%) - 0 0.2 -0.12 -0.46 4.49 0.4 12.15 1.83

BSP + Predictive Algorithm

Policy NSP BSP LO SA SMA EMA LPF AR(1) AR(2)

Total Revenue 614.1 1249.5 1341.3 1333.7 1333.9 1360.6 1332.8 1346.5 1333.4

Improvement over BSP (%) - 0 7.35 6.74 6.75 8.89 6.67 7.76 6.71

Table 3: Revenue gains for switching policy and forecasting combinations under the second workload.

PSP + Predictive Algorithm

Policy NSP PSP LO SA SMA EMA LPF AR(1) AR(2)

Total Revenue 632.6 810.6 786.1 758.6 803.9 811 816.1 825 820.9

Improvement over PSP (%) - 0 -3.02 -6.41 -0.83 0.05 0.68 1.78 1.27

BSP + Predictive Algorithm

Policy NSP BSP LO SA SMA EMA LPF AR(1) AR(2)

Total Revenue 632.6 2200.1 2273.3 2306.2 2360.3 2309.2 2182.8 2155.5 2279.3

Improvement over BSP (%) - 0 3.33 4.82 7.28 4.96 -0.79 -2.03 3.6

600

700

800

900

1000

1100

1200

1300

1400

1500

NSP

AWM(5m)

AWM(10m)

AWM(15m)

AWM(20m)

AWM(25m)

AWM(30m)

AWM(1h)

AWM(2h)

AWM(12h)

SM(T2)

SM(T3)

SM(T4)

SM(AVG)

Total Revenue (x10^5)

Algorithm

PSP

BSP

Figure 5: Revenue using Active Window Model (AWM),

Voting Model (VM), and Selective Model (SM) under the

ﬁrst workload.

that from the original PSP without prediction. The re-

mainder of the predictive algorithms also yielded bet-

ter results than the original revenue computed from

the PSP scheme without forecasting, with just two ex-

ceptions – when the predictors SA and SMA are used

the revenue drops by -0.12% and -0.46% respectively.

Table 2 also show that applying the predictive al-

gorithms with BSP provides at least 6.67% improve-

ment in system revenue with LPF and up to 8.89%

when EMA is applied alongside BSP.

The achieved revenue from using the three differ-

ent meta-forecasting models are shown in ﬁgure 5.

The Active Window Model (AWM) performs the best

on the given workload and the revenue can be up to

14.06% higher when it is applied with PSP and up

to 36.70% higher with BSP when it is applied every

one and twelve hours respectively. While these re-

sults look encouraging, we sound a note of caution

by highlighting the fact that the improvement drops

by -0.43%, -8.27%, and -22.22% when the AWM is

applied with BSP for every 10m, 20m, and 2h re-

spectively. This method is clearly sensitive to work-

load and its employment and conﬁguration therefore

would need to be subject to realistic trials if it is to be

most effective.

The voting model (VM) is not effective with PSP

(resulting in a 3.45% drop in revenue), yet it does sig-

nify improvements with BSP (resulting in a 9.71%

improvement in revenue) when its performance is

compared to the switching policy with no forecasting.

The Selective Model is able to bring about improve-

ments to both switching policies, revenue is up 2.07%

CLOSER 2011 - International Conference on Cloud Computing and Services Science

558

500

1000

1500

2000

2500

NSP

AWM(5m)

AWM(10m)

AWM(15m)

AWM(20m)

AWM(25m)

AWM(30m)

AWM(1h)

AWM(2h)

AWM(12h)

SM(T2)

SM(T3)

SM(T4)

SM(AVG)

Total Revenue (x10^5)

Algorithm

PSP

BSP

Figure 6: Revenue using Active Window Model (AWM),

Voting Model (VM), and Selective Model (SM) under the

second workload.

using a combination of SM(B4) and PSP, and is up

7.33% using a combination of SM(B2) and BSP.

8.2 Experiment Two

We repeat the experiments with the second work-

load – this contains 3,328,587 requests issued to the

ClarkNet WWW server over the period of two weeks.

The resulting values are shown in table 3 and ﬁgure

6. Table 3 shows that the system revenue is improved

by 28.14% and 247.79% using PSP and BSP respec-

tively compared to NSP. The ﬁgure also shows a fur-

ther 1.78% and 7.28% improvement when the AR1

and SMA predictors are applied with PSP and BSP

respectively.

AWM provide further improvement (by 12.24%

and 8.32%) when it is applied every twelve hours with

PSP and BSP on the given workload. VM decreases

the performance of the system by -0.58% when it is

used with PSP and up to -25.29% with BSP.

The highest improvements that can be achieved

using SM with the PSP and BSP are 5.82% and

5.50% (when applied as SM(B2) and SM(B4)). There

is again a lack of consistency however as revenue

drops by -1.12% (using SM(B4)) and -26.32% (using

SM(3)) with PSP and BSP respectively.

8.3 Experiment Three

Finally we undertake the same experiments using the

third workload, which contains 3,461,612 requests to

the the NASA Kennedy Space Center web-server in

Florida. Results are shown in table 4 and ﬁgure 7. It

is shown that the revenue of the system can be im-

proved as much as 69.19% and 44.14% when BSP

and PSP are applied to the system compared with the

NSP; a further 45.91% and 0.76% improvement can

500

600

700

800

900

1000

1100

1200

1300

NSP

AWM(5m)

AWM(10m)

AWM(15m)

AWM(20m)

AWM(25m)

AWM(30m)

AWM(1h)

AWM(2h)

AWM(12h)

SM(T2)

SM(T3)

SM(T4)

SM(AVG)

Total Revenue (x10^5)

Algorithm

PSP

BSP

Figure 7: Revenue using Active Window Model (AWM),

Voting Model (VM), and Selective Model (SM) under the

third workload.

be achieved when the SA and AR(1) are applied to

the system along with BSP and PSP respectively (see

table 4).

The AWM performs the best compared with the

other two models where the system revenue can be

improved by 4.68% and 40.69% with the PSP and

BSP policies, when AWM is applied every ten and

twenty minutes respectively (the improvement is be-

tween 0.96%-1.63% and 29.33%-39.71% using the

remaining categories of AWM with PSP and BSP re-

spectively).

When VM and SM are applied with PSP, it does

not provide a good improvement in system revenue (-

3.19%) with VM and from 0.08% to -2.98% with SM.

VM however gives a good improvement in system

revenue with BSP where the improvement reaches

15.08%. SM also provide a reasonable improvement

(from 8.21% with SM(B2) and up to 36.56% with

SM(B3)) in system revenue when applied alongside

BSP.

8.4 Analysis

Tables 5, 6 and 7 provide a useful summary of these

ﬁndings. Essentially we contrast an enterprise sys-

tem with ﬁxed resources (NSP) with several alter-

natives: a system that employes a dynamic server

switching policy (PSP or BSP); a system that uses

PSP or BSP, and a single forecasting scheme; and ﬁ-

nally a system that employes PSP or BSP, and a meta-

forecasting scheme. There are some interesting ob-

servations from this data

• Dynamic server switching (using PSP or BSP) im-

proves revenue in all cases. The Bottleneck Aware

Switching policy is particularly effective;

• Using a single forecasting scheme in tandem with

DYNAMIC RESOURCE ALLOCATION AND ACTIVE PREDICTIVE MODELS FOR ENTERPRISE APPLICATIONS

559

Table 4: Revenue gains for switching policy and forecasting combinations under the third workload.

PSP + Predictive Algorithm

Policy NSP PSP LO SA SMA EMA LPF AR(1) AR(2)

Total Revenue 513.8 740.6 718.0 703.1 742.2 745.0 730.7 746.2 740.5

Improvement over PSP (%) - 0 -3.05 -5.06 0.22 0.59 -1.34 0.76 -0.01

BSP + Predictive Algorithm

Policy NSP BSP LO SA SMA EMA LPF AR(1) AR(2)

Total Revenue 513.8 869.3 940.0 1268.4 1164.6 978.3 1260.7 1083.5 1206.3

Improvement over BSP (%) - 0 8.13 45.91 33.97 12.54 45.02 24.64 38.77

Table 5: Analysis of the ﬁrst workload.

Policy NSP PSP Best Single Policy Worst Single Policy Best meta-policy

Total Revenue 614.1 648.7 (AR1) 727.5 (SMA) 645.7 (AWM(1h)) 739.9

Improvement over PSP (%) - 0 12.15 -0.46 14.06

Policy NSP BSP Best Single Policy Worst Single Policy Best meta-policy

Total Revenue 614.1 1249.5 (EMA) 1360 (LPF) 1332.8 (AWM(30m)) 1420.3

Improvement over BSP (%) - 0 8.84 6.67 13.67

Table 6: Analysis of the second workload.

Policy NSP PSP Best Single Policy Worst Single Policy Best meta-policy

Total Revenue 632.6 810.6 (AR1) 825 (SA) 758.6 (AWM(12h)) 909.8

Improvement over PSP (%) - 0 1.78 -6.41 12.24

Policy NSP BSP Best Single Policy Worst Single Policy Best meta-policy

Total Revenue 632.6 2200.1 (SMA) 2360.3 (AR1) 2155.5 (AWM(12h)) 2383.1

Improvement over BSP (%) - 0 7.28 -2.03 8.32

Table 7: Analysis of the third workload.

Policy NSP PSP Best Single Policy Worst Single Policy Best meta-policy

Total Revenue 513.8 740.6 (AR1) 746.2 (SA) 703.1 (AWM(20m)) 752.7

Improvement over PSP (%) - 0 0.76 -5.06 1.63

Policy NSP BSP Best Single Policy Worst Single Policy Best meta-policy

Total Revenue 513.8 869.3 (SA) 1268.4 (LO) 940.0 (AWM(20m)) 1223

Improvement over BSP (%) - 0 45.91 8.13 40.69

PSP or BSP is difﬁcult. First, no one scheme wins

out across all workload (the best single policy in-

cludes AR(1), EMA, SMA, SA over our three

workloads). Second, if the wrong scheme is cho-

sen, this may indeed reduce the overall revenue

generated (it does so in more than half the cases

we test);

• The meta-forecasting schemes always improve

revenue when used in tandem with PSP or BSP. In

the worst case the improved revenue will be negli-

gible (1.63%, workload three, PSP, AWM(20m));

in the best case the revenue may be increased

by around 40% (40.69% workload three, BSP,

AWM(20m));

• The Active Window Model (AWM) proves to

be the best scheme in all cases; on average

this scheme gives an improvement in revenue of

15.1% over all three real-world workloads. The

size of the active window is important and must

therefore be subject to some pre-calculatation

based on sample traces.

CLOSER 2011 - International Conference on Cloud Computing and Services Science

560

9 CONCLUSIONS AND FUTURE

WORK

Through modelling and supporting simulation, we

combine the reactive behaviour of two well known

switching policies – the Proportional Switching Pol-

icy (PSP) and the Bottleneck Aware Switching policy

(BSP) – with the proactive properties of several work-

load forecasting models. Seven forecasting models

are used, including Last Observation, Simple Algo-

rithm, Sample Moving Average, Low Pass Filter and

Autoregressive Moving Average. As each of the fore-

casting schemes has its own bias, we also develop

three meta-forecasting models (the Active Window

Model, the Voting Model and the Selective Model)

to ensure consistent and improved results.

We base our results on real-world workload traces

from several sources, including from the San Diego

Supercomputer Centre, from the ClarkNet Internet

access provider for the Metro Baltimore-Washington

DC area and, from the NASA Kennedy Space Cen-

ter web-server in Florida. For each of the three real-

world workloads, we contrast an enterprise system

with ﬁxed resources (no switching policy) with sev-

eral alternatives: a system that employes a dynamic

server switching policy (PSP or BSP); a system that

uses PSP or BSP, and a single forecasting scheme;

and ﬁnally a system that employes PSP or BSP, and

a meta-forecasting scheme.

The results are signiﬁcant in a number of respects:

(i) Dynamic server switching (using PSP or BSP) im-

proves revenue in all cases, the Bottleneck Aware

Switching policy is particularly effective; (ii) Using a

single forecasting scheme in tandem with PSP or BSP

is difﬁcult as no one scheme wins out across all work-

loads and, if the wrong scheme is chosen, this may

lead to a reduction in the overall revenue generated

by the system; (iii) The meta-forecasting schemes al-

ways improve revenue when used in tandem with PSP

or BSP, in the worst case the improved revenue will

be negligible, in the best case the revenue may be

increased by around 40%; (iv) The Active Window

Model (AWM) proves to be the best scheme in all

cases, on average this scheme gives an improvement

in revenue of 15.1% over all three real-world work-

loads, the size of the active window is important and

must therefore be subject to some pre-calculatation

based on sample traces.

We are currently investigating the effectiveness

of these schemes in extreme (highly bursty) environ-

ments.

REFERENCES

Al-Ghamdi, M., Chester, A. P., and Jarvis, S. A. (2010).

Predictive and dynamic resource allocation for enter-

prise applications. In Proceedings of the 2010 10th

IEEE International Conference on Scalable Comput-

ing and Communications (ScalCom), pages 2776–

2783, Washington, DC, USA. IEEE Computer Soci-

ety.

Arlitt, M. and Williamson, C. (1996). Web server workload

characterization: the search for invariants. SIGMET-

RICS Perform. Eval. Rev., 24(1):126–137.

Casale, G. and Serazzi, G. (2004). Bottlenecks identiﬁ-

cation in multiclass queueing networks using convex

polytopes. In 12th Annual Meeting of the IEEE Int’l

Symposium on Modelling, Analysis, and Simulation of

Comp. and Telecommunication Systems (MASCOTS).

Cavendish, D., Koide, H., Oie, Y., and Gerla, M. (2010).

A mean value analysis approach to transaction perfor-

mance evaluation of multi-server systems. Concurr.

Comput. : Pract. Exper., 22(10):1267–1285.

Cherkasova, L. and Phaal, P. (2002). Session-based ad-

mission control: A mechanism for peak load manage-

ment of commercial web sites. IEEE Trans. Comput.,

51(6):669–685.

Chester, A. P., Xue, J. W. J., He, L., and Jarvis, S. A. (2008).

A system for dynamic server allocation in applica-

tion server clusters. In ISPA ’08: Proceedings of the

2008 IEEE International Symposium on Parallel and

Distributed Processing with Applications, pages 130–

139, Washington, DC, USA. IEEE Computer Society.

Cuomo, G. (2000). IBM WebSphere Application Server

Standard and Advanced Editions; A methodology for

performance tuning. IBM.

Dushay, N., French, J. C., and Lagoze, C. (September

1999). Predicting indexer performance in a distributed

digital library. In Third European Conference on Re-

search and Advanced Technology for Digital Libraries

(ECDL99), Paris, France.

Faraz, A. and Vijaykumar, T. (2010). Joint optimization of

idle and cooling power in data centers while maintain-

ing response time. SIGPLAN Not., 45(3):243–256.

Federgruen, A. and Groenevelt, H. (1986). The greedy pro-

cedure for resource allocation problems: Necessary

and sufﬁcient conditions for optimality. Oper. Res.,

34(6):909–918.

Gilly, K., Alcaraz, S., Juiz, C., and Puigjaner, R. (2004).

Comparison of predictive techniques in cluster-based

network servers with resource allocation. Modeling,

Analysis, and Simulation of Computer Systems, Inter-

national Symposium on, pages 545–552.

Hsieh, C. and Lam, S. (1988). Pam-a noniterative approx-

imate solution method for closed multichain queue-

ing networks. SIGMETRICS Perform. Eval. Rev.,

16(1):261–269.

Keung, H. N. L. C., Dyson, J. R. D., Jarvis, S. A., and Nudd,

G. R. (2003). Predicting the performance of globus

monitoring and discovery service (mds-2) queries. In

Proceedings of the 4th International Workshop on

DYNAMIC RESOURCE ALLOCATION AND ACTIVE PREDICTIVE MODELS FOR ENTERPRISE APPLICATIONS

561

Grid Computing, GRID ’03, pages 176–, Washington,

DC, USA. IEEE Computer Society.

LBNL (2008). Internet Trafﬁc Archive Hosted

at Lawrence Berkeley National Laboratory.

http://ita.ee.lbl.gov/html/traces.html.

Litoiu, M. (2007). A performance analysis method for au-

tonomic computing systems. ACM Transactions on

Autonomous and Adaptive Systems (TAAS), 2(1):3.

Little, J. ((May - Jun., 1961)). A proof for the queuing for-

mula: L= λ w. Operations Research, 9(3):383–387.

Liu, Z., Squillante, M., and Wolf, J. (2001). On maximizing

service-level-agreement proﬁts. In EC ’01: Proceed-

ings of the 3rd ACM conference on Electronic Com-

merce, pages 213–223, New York, NY, USA. ACM.

Mahanti, A., Williamson, C., and Wu, L. (2009). Work-

load characterization of a large systems conference

web server. In Proceedings of the 2009 Seventh An-

nual Communication Networks and Services Research

Conference, pages 55–64, Washington, DC, USA.

IEEE Computer Society.

Marzolla, M. and Mirandola, R. (2007). Performance

prediction of web service workﬂows. The third In-

ternational Conference on the Quality of Software-

Architectures (QoAS), pages 127–144.

Menasc

e, D. (2003). Workload characterization. In IEEE

Internet Computing, pages 89–92, Piscataway, NJ,

USA. IEEE Educational Activities Department.

Menasc

e, D. and Almeida, V. (May 7, 2000). Scaling

for E-Business: Technologies, Models, Performance,

and Capacity Planning. Prentice Hall, Upper Saddle

River, NJ.

Menasc

e, D. and Almeida, V. (September 21, 2001). Ca-

pacity Planning for Web Services: Metrics, Models,

and Methods. Prentice Hall, Upper Saddle River, NJ.

Reiser, M. and Lavenberg, S. (1980). Mean-value analysis

of closed multichain queuing networks. Journal of the

Association for Computing Machinary, 27(2):313–

322.

Rolia, J., Zhu, X., Arlitt, M., and Andrzejak, A. (2004).

Statistical service assurances for applications in utility

grid environments. Perform. Eval., 58(2+3):319–339.

Tantawi, A., Towsley, G., and Wolf, J. (1988). Optimal al-

location of multiple class resources in computer sys-

tems. SIGMETRICS Perform. Eval. Rev., 16(1):253–

260.

Urgaonkar, B., Shenoy, P., Chandra, A., and Goyal, P.

(2005). Dynamic provisioning of multi-tier internet

applications. In ICAC ’05: Proceedings of the Sec-

ond International Conference on Automatic Comput-

ing, pages 217–228, Washington, DC, USA. IEEE

Computer Society.

Xue, J. W. J., Chester, A. P., He, L., and Jarvis, S. A.

(2008). Dynamic resource allocation in enterprise sys-

tems. In ICPADS ’08: Proceedings of the 2008 14th

IEEE International Conference on Parallel and Dis-

tributed Systems, pages 203–212, Washington, DC,

USA. IEEE Computer Society.

Yang, C. and Luo, M. (2000). Realizing fault resilience

in web-server cluster. In Proceedings of the 2000

ACM/IEEE conference on Supercomputing, Super-

computing ’00, Washington, DC, USA. IEEE Com-

puter Society.

Zalewski, A. and Ratkowski, A. (2006). Evaluation of de-

pendability of multi-tier internet business applications

with queueing networks. In Proceedings of the In-

ternational Conference on Dependability of Computer

Systems, pages 215–222, Washington, DC, USA.

IEEE Computer Society.

CLOSER 2011 - International Conference on Cloud Computing and Services Science

562