Multi-objective Optimization for Virtual Machine Allocation in Computational Scientific Workflow under Uncertainty

Arun Ramamurthy, Priyanka Pantula, Mangesh Gharote, Kishalay Mitra and Sachin Lodha

TCS Research and Innovation, Tata Consultancy Services, India

Indian Institute of Technology, Hyderabad, India

Keywords: Cloud Computing, Scientific Workflow, Resource Allocation, Multi-objective Optimization, NSGA-II, Chance Constrained Programming.

Abstract:

Providing resources and services from various cloud providers is an increasingly promising paradigm. Workflow applications are becoming increasingly computation-intensive or data-intensive, with resources allocated and billed on a pay-per-use basis. In this paper, a multi-objective optimization study for scientific workflows in a cloud environment is proposed. The aim is to minimize execution time and purchasing cost simultaneously while satisfying the demand requirements of customers. The uncertainties present in the model are identified and handled using a well-known technique called Chance Constrained Programming (CCP) for real-world implementation. The model is solved using the Non-dominated Sorting Genetic Algorithm II (NSGA-II). The study shows that the solutions obtained when uncertainties are considered differ from those of the deterministic case. Depending on the probability of constraint satisfaction, the objective function values improve, but at the cost of the reliability of the solution.

1 INTRODUCTION

Cloud computing has emerged as a popular paradigm in which computing resources are provided on demand under a pay-per-use pricing mechanism (Aslam et al., 2017; Ferdaus et al., 2017). In a cloud platform, a data center often manages a large number of Virtual Machines (VMs), which are used to execute computationally intensive tasks. In the present commercial environment, each cloud provider offers a diverse range of VM types at varying prices, and the user has to select the best resources for executing a particular task. Therefore, providing optimal resources and services from cloud providers is a vital area of research (Mohammadi et al., 2018; Hu et al., 2018; Heilig et al., 2020; Ramamurthy et al., 2020).

In the cloud environment, the optimal VM allocation for a scientific workflow is formulated by considering two main aspects while meeting the users' requirements: (i) cost components, such as purchasing cost and resource sharing cost, and (ii) execution time. The objective function varies linearly with respect to both these components, and in the literature this allocation problem is known to be NP-hard (Madni et al., 2016). The decision variables include the number of VMs allocated, the configuration of the VMs (provided by the cloud provider), the time at which a VM is allocated and the total execution time. Besides the challenges of solving NP-hard problems, in practical scenarios the user's requirements, such as memory and storage capacity, might be non-deterministic. They may vary due to their dependence on the percentage of work completed, uncertainties in task execution or inaccurate estimation of requirements. Thus, there is a need to allocate VMs optimally while simultaneously considering cost components, execution time and uncertain requirements, which are bounded rather than fixed values.

Under uncertain situations, for ease of handling the optimization routine, problems are most often assumed to be deterministic and solved using deterministic optimization algorithms. However, such a study might lead to unrealistic solutions or decisions in practical scenarios (Diwekar, 2020). To illustrate, when minimizing the computation cost of an application under dynamically changing demand for a resource, a cost based on deterministic demand might deteriorate the application efficiency or increase the energy consumption during periods in which the demand is over- or under-estimated. As a result, over the past few years, uncertainty handling during decision making has been gaining importance in both industrial and academic research (Diwekar, 2020; Ning and You, 2019).

Ramamurthy, A., Pantula, P., Gharote, M., Mitra, K. and Lodha, S.
Multi-objective Optimization for Virtual Machine Allocation in Computational Scientific Workflow under Uncertainty.
DOI: 10.5220/0010453302400247
In Proceedings of the 11th International Conference on Cloud Computing and Services Science (CLOSER 2021), pages 240-247
ISBN: 978-989-758-510-4

Some of the well-known techniques for optimization under uncertainty include Stochastic Programming, Chance Constrained Programming (CCP), Robust Optimization, the Expected Value Model and Fuzzy Mathematical Programming (Diwekar, 2020). Among these, CCP has emerged as one of the popular approaches for dealing efficiently with uncertainties. It has been applied to diverse domains, including scheduling, process modelling and optimization, and process design (Odetayo et al., 2018; Wang and Ning, 2017). In CCP, the constraints need to be satisfied with a predefined probability, rather than for all realizations of the uncertain parameters. Consequently, the reliability of the solution depends on the chosen probability of constraint satisfaction. To make such a probabilistic formulation more tractable, the CCP formulation is converted to an equivalent deterministic formulation, which can then be solved using any deterministic optimization technique. In the optimal VM allocation problem, since the uncertain parameters (users' requirements) appear linearly in the constraints, the CCP approach can be applied by implementing a coordinate transformation or by calculating classical probability values (Mitra, 2013). Additionally, the problem size in CCP remains manageable even when the number of uncertain parameters increases.

Despite the diverse range of applications of CCP, the uncertainties arising in the VM allocation problem are rarely addressed in the literature. This might be because the underlying deterministic optimization problem is itself hard. Moreover, the deterministic VM allocation problem is multi-objective in nature. Various trade-offs can occur among goals such as fast implementation, low-cost resources, secure and reliable resources and low energy wastage. However, in most of the existing works, the constrained Multi-Objective Optimization Problem (MOOP) is treated as a constrained single-objective optimization problem. This is known to be a less efficient way of solving MOOPs, since the problem needs to be solved multiple times to obtain the complete set of Pareto-Optimal (PO) solutions (Heilig et al., 2016). Since the deterministic case is formulated as a MOOP, the uncertain optimization problem also has multiple objectives, with some of the constraints being uncertain. Therefore, there is a need to solve the multi-objective VM allocation problem efficiently while considering uncertain user requirements. The problem is non-trivial, as the resources need to be allocated optimally at each time instance over the entire time horizon and the right resource configuration must be identified, in the presence of varying yet bounded requirements.

In this paper, the aforementioned deterministic MOOP of resource allocation is solved using a well-known evolutionary optimization algorithm, the Non-dominated Sorting Genetic Algorithm II (NSGA-II), which is capable of handling conflicting objectives efficiently. Moreover, some of the uncertain parameters present in the problem are identified, and the resulting optimization problem under uncertainty is solved using CCP. The solutions thus obtained are analyzed and significant conclusions are drawn. The overall methodology is generic enough to allocate VMs from a cloud provider irrespective of the scientific workflow application considered. Even though the uncertain parameters in this study are present in the constraints alone, the CCP technique can also handle uncertain objective functions.

The rest of the paper is organized as follows: Sec-

tion 2 describes a brief review of the existing work

in resource allocation for cloud computing and opti-

mization under uncertainty. Section 3 illustrates the

mathematical model for deterministic and stochas-

tic optimization. In Section 4, the method is tested

against a speciﬁc application called nug22-sbb using

the VMs provided by Amazon. Finally, the work is

concluded in Section 5.

2 RELATED WORK

Resource allocation for scientific workflow tasks in the cloud is a challenging research problem. Many works in the literature have sought an optimal workflow schedule that meets user requirements. Mao and Humphrey (2011) presented an auto-scaling mechanism to minimize cost and meet application deadlines in cloud workflows. Calheiros and Buyya (2013) developed an algorithm that uses the idle time of provisioned resources and the budget surplus to replicate tasks; this replication strategy mitigates the effects of resource performance variation so that the soft deadline of the workflow is satisfied. Zeng et al. (2015) proposed a Security-Aware and Budget-Aware (SABA) scheduling scheme for optimizing the makespan under both security and budget constraints. Considering the security threats in the cloud, a Security and Cost Aware Scheduling (SCAS) mechanism was devised for scientific workflow applications with heterogeneous tasks (Li et al., 2016). However, these single-objective workflow scheduling methods fail to provide diverse solutions for cloud users to choose from.

Most of these studies do not employ a global optimization technique capable of producing a near-optimal solution. Instead, they rely on task-level optimization and thus fail to take advantage of the entire workflow structure and its characteristics to generate a globally optimal solution. A few other studies have applied global optimization algorithms to the workflow VM allocation problem. For instance, Pandey et al. (2010) proposed a Particle Swarm Optimization (PSO) based algorithm to minimize the execution cost of a single workflow while balancing the task load on the available resources. Aiming at the shortcomings of existing scheduling methods for batch processing workflows, Wen et al. (2012) investigated the optimization problem of grouping and scheduling multiple activity instances in batch processing workflows. Rodriguez and Buyya (2014) used a PSO algorithm to minimize the overall workflow execution cost while meeting the deadline constraint in clouds; the algorithm was devised to meet the users' requirements and to incorporate the basic principles of cloud computing. Nonetheless, in this small body of work on global optimization techniques, the VM allocation model was formulated with a single objective, even though the problem is known to be multi-objective in nature. This makes the corresponding optimization study less effective, and many feasible regions often remain unexplored.

Some works in the literature have also considered multiple objectives in workflow scheduling. Fard et al. (2014) developed a list scheduling heuristic for multi-objective workflow scheduling in cloud-based and heterogeneous distributed computing systems. Zhu et al. (2015) developed a multi-objective optimization method based on an evolutionary algorithm to address workflow scheduling in the cloud computing environment. However, research in this domain is still in progress, and the existing algorithms may not be directly applicable in the cloud environment due to their complexity. Hence, there is a need to formulate the VM allocation or workflow scheduling model as a multi-objective problem that is tractable and easily scalable, such that the optimal resources which satisfy the users' requirements are identified using global

Table 1: Notation for the Model Parameters.

Symbol        Description
$R$           Total processing requirement
$M$           Total memory requirement
$S$           Total storage requirement
$C$           Set of VM types
$T$           Time horizon
$N^{max}$     Maximum available VMs in each type
$N$           Set of VM instances in each type ($|N| = N^{max}$)
$P_c$         Cost of renting VM type $c$ for a time period
$S_c$         Storage capacity of VM type $c$
$M_c$         Memory capacity of VM type $c$
$R_c$         Processing capacity of VM type $c$
$T^{max}$     Last time period a VM has been allocated
$x_{cjt}$     1 if VM $j$ of type $c$ is allocated at time $t$; 0 otherwise

optimization algorithms. Moreover, only a limited number of existing works have considered the uncertainties arising in the customer demand requirements, which is often the realistic case (Calzarossa et al., 2019; Tchernykh et al., 2019). Due to the random nature of these uncertain parameters, the multi-objective optimization formulation of VM allocation becomes uncertain, or stochastic. Over the last few years, uncertainty handling has become a major domain of research, as it helps in the practical implementation of the obtained solutions. Therefore, in this paper, the multi-objective optimization problem of VM allocation is formulated with the uncertainties taken into account and solved using a global optimization algorithm.

3 OPTIMAL RESOURCE ALLOCATION UNDER UNCERTAINTY

In this section, we discuss the deterministic and

stochastic optimization model for resource allocation.

Further, we state the Chance Constrained Program-

ming technique for solving the stochastic optimiza-

tion problem.

3.1 Deterministic Multi-objective Optimization Model

Prior to the problem definition, the notation used for describing the model in the remaining sections of the paper is presented in Table 1. Consider a set of consumer requirements, which can be categorized as application-based and non-application-based requirements. The former category consists of the following demand requirements: (i) the total processing requirement $R$, (ii) the total memory requirement $M$, and (iii) the total storage requirement $S$. The latter, non-application-based category includes the upper limits on (a) the budget $B$ and (b) the execution time $T^{max}$ associated with deploying workflow applications on different resources. In this paper, one main cost component, the purchasing cost of the various VM types, is considered along with the overall execution time for completing a specific application. The application is executed faster, that is, $T^{max}$ is lower, if high-power VMs are used. However, using high-power VMs increases the service cost, as the VM cost increases with computational power. Therefore, there is an evident trade-off between the two objectives. Hence, the goal of this study is to simultaneously minimize the purchasing cost of VMs and minimize $T^{max}$, where the decision variables comprise the number of VMs of each configuration offered by the provider, the time of usage of each of these VMs and the execution time ($T^{max}$ acts as both a decision variable and an objective function). Additionally, the formulated multi-objective optimization problem contains constraints based on the application-based and non-application-based user requirements.

The aforementioned constrained multi-objective

optimization formulation is termed as optimal VM

allocation problem and the same is mathematically

represented using the equations shown below (Heilig

et al., 2016; Coutinho et al., 2015).

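\begin{align}
\min \; & \sum_{c \in C} \sum_{j \in N} \sum_{t \in T} P_c \, x_{cjt} \tag{1} \\
\min \; & T^{max} \tag{2}
\end{align}

subject to

\begin{align}
& \sum_{c \in C} \sum_{j \in N} \sum_{t \in T} P_c \, x_{cjt} \le B \tag{3} \\
& \sum_{c \in C} \sum_{j \in N} S_c \, x_{cjt} \ge S \, x_{c'j't} \quad \forall t \in T,\; c' \in C,\; j' \in N \tag{4} \\
& \sum_{c \in C} \sum_{j \in N} M_c \, x_{cjt} \ge M \, x_{c'j't} \quad \forall t \in T,\; c' \in C,\; j' \in N \tag{5} \\
& \sum_{c \in C} \sum_{j \in N} \sum_{t \in T} R_c \, x_{cjt} \ge R \tag{6} \\
& \sum_{j \in N} x_{cjt} \le N^{max} \quad \forall t \in T,\; c \in C \tag{7} \\
& T^{max} \ge t \, x_{cjt} \quad \forall t \in T,\; c \in C,\; j \in N \tag{8} \\
& x_{cj(t+1)} \le x_{cjt} \quad \forall t \in \{1, 2, \ldots, |T| - 1\},\; c \in C,\; j \in N \tag{9} \\
& x_{c(j+1)t} \le x_{cjt} \quad \forall t \in T,\; c \in C,\; j \in \{1, 2, \ldots, N^{max} - 1\} \tag{10} \\
& T^{max} \in \mathbb{Z} \;\text{ and }\; x_{cjt} \in \{0, 1\} \quad \forall t \in T,\; c \in C,\; j \in N \tag{11}
\end{align}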

Since all the decision variables of the VM allocation model are restricted to integers, this is an Integer Linear Programming (ILP) problem, which is NP-hard in general (Madni et al., 2016). The constraints in Eqs. 3 to 11 imply the following:

Eq. 3 ensures that the purchasing cost of the different VM types does not exceed the budget. Eq. 4 ensures that the purchased storage capacity is sufficient to satisfy the storage requirement (S) at each time period. Eq. 5 ensures that the purchased memory capacity is sufficient to satisfy the memory requirement (M) at each time period. Eq. 6 ensures that the purchased processing capacity is sufficient to satisfy the overall processing demand (R). Eq. 7 guarantees that the number of VMs used does not exceed the maximum number of VMs available for each VM type at each time period; we assume that the maximum number of VMs available is the same for all VM types. Eq. 8 bounds the execution time from below by every time period in which a VM is allocated. Eq. 9 ensures that a VM of a specific type which is selected at time t+1 is also selected at time t. Eq. 10 ensures that the (j+1)th VM is used only if the jth VM is assigned.
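The constraint descriptions above can be made concrete with a small checker. The following is a minimal sketch, not the authors' implementation: the function names, the dictionary-based data layout, the 0-based period index and the toy numbers are all assumptions made for illustration. It evaluates the two objectives and tests Eqs. 3 to 7, 9 and 10 for a candidate binary allocation, reading Eqs. 4 and 5 as binding only in periods where at least one VM is rented.

```python
from typing import Dict, List

# x[c][j][t] == 1 means VM instance j of type c is rented in period t.
# Periods are 0-based here, so index t corresponds to time period t + 1 in the model.
Allocation = Dict[str, List[List[int]]]


def objectives(x: Allocation, cost: Dict[str, float]):
    """Return (purchasing cost, last period in which any VM is rented)."""
    total_cost = sum(cost[c] * bit for c in x for row in x[c] for bit in row)
    used = [t + 1 for c in x for row in x[c] for t, bit in enumerate(row) if bit]
    return total_cost, max(used, default=0)


def is_feasible(x, cost, storage, memory, processing, S, M, R, B, n_max) -> bool:
    """Check Eqs. 3-7, 9 and 10 for a candidate allocation (illustrative only)."""
    periods = len(next(iter(x.values()))[0])
    total_cost, _ = objectives(x, cost)
    total_proc = sum(processing[c] * bit for c in x for row in x[c] for bit in row)
    if total_cost > B or total_proc < R:  # Eq. 3 (budget) and Eq. 6 (processing)
        return False
    for t in range(periods):
        active = {c: sum(row[t] for row in x[c]) for c in x}
        if any(n > n_max for n in active.values()):  # Eq. 7
            return False
        if any(active.values()):
            # Eqs. 4 and 5 bind only in periods where at least one VM is rented.
            if sum(storage[c] * n for c, n in active.items()) < S:
                return False
            if sum(memory[c] * n for c, n in active.items()) < M:
                return False
    for c in x:
        for j, row in enumerate(x[c]):
            if any(row[t + 1] > row[t] for t in range(periods - 1)):  # Eq. 9
                return False
            if j + 1 < len(x[c]) and any(
                nxt > cur for nxt, cur in zip(x[c][j + 1], row)
            ):  # Eq. 10
                return False
    return True


# Hypothetical toy instance: two VM types, two instances each, three periods.
toy_cost = {"small": 1.0, "large": 2.0}
toy_storage = {"small": 10, "large": 20}
toy_memory = {"small": 4, "large": 8}
toy_processing = {"small": 100, "large": 200}
toy_x = {"small": [[1, 1, 0], [0, 0, 0]], "large": [[1, 0, 0], [0, 0, 0]]}
toy_bad = {"small": [[0, 1, 0], [0, 0, 0]], "large": [[0, 0, 0], [0, 0, 0]]}
```

In this toy instance, `toy_x` rents one small VM for two periods and one large VM for one period, while `toy_bad` switches a VM on in the second period without renting it in the first, which Eq. 9 forbids.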

The decision variables comprise two components: $x_{cjt}$, which is binary, and $T^{max}$, which is integer (as shown in Eq. 11). The model can be extended to a multi-cloud environment, and other cost components can also be considered. The inclusion of those additional components does not affect the implementation of the proposed framework discussed below, so the methodology scales to the extended version of the model as well.

3.2 Stochastic Optimization Model

In the deterministic VM allocation model (Eqs. 1 to 11), the customer requirements, categorized as application-based and non-application-based, may not always be fixed. For instance, to execute a computationally intensive data mining task, the user may purchase 3 VMs of type 1 and 2 VMs of type 2 for a period of 20 hours, hoping that the task will be completed within that time. However, after around 50% of the task has been completed, the customer might change the requirements by increasing or decreasing the number of VMs of each type, or even by requesting a new type of VM, depending on the computational speed and the usage of the resources deployed so far. This flexibility in the users' requirements not only enables them to choose sufficient and appropriate VMs for faster completion of the task but also helps in eliminating unnecessary resource costs. In most cases, the inputs provided by the user keep varying and are hence termed uncertain parameters.

On considering the aforementioned uncertainties,

the stochastic (or uncertain) optimization formulation

is shown as follows:

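\begin{align}
\min \; & \sum_{c \in C} \sum_{j \in N} \sum_{t \in T} P_c \, x_{cjt} \tag{12} \\
\min \; & T^{max} \tag{13}
\end{align}

subject to

\begin{align}
& \sum_{c \in C} \sum_{j \in N} \sum_{t \in T} P_c \, x_{cjt} \le B \tag{14} \\
& \sum_{c \in C} \sum_{j \in N} S_c \, x_{cjt} \ge \xi(1) \, x_{c'j't} \quad \forall t \in T,\; c' \in C,\; j' \in N \tag{15} \\
& \sum_{c \in C} \sum_{j \in N} M_c \, x_{cjt} \ge \xi(2) \, x_{c'j't} \quad \forall t \in T,\; c' \in C,\; j' \in N \tag{16} \\
& \sum_{c \in C} \sum_{j \in N} \sum_{t \in T} R_c \, x_{cjt} \ge \xi(3) \tag{17} \\
& \sum_{j \in N} x_{cjt} \le N^{max} \quad \forall t \in T,\; c \in C \tag{18} \\
& T^{max} \ge t \, x_{cjt} \quad \forall t \in T,\; c \in C,\; j \in N \tag{19} \\
& x_{cj(t+1)} \le x_{cjt} \quad \forall t \in \{1, 2, \ldots, |T| - 1\},\; c \in C,\; j \in N \tag{20} \\
& x_{c(j+1)t} \le x_{cjt} \quad \forall t \in T,\; c \in C,\; j \in \{1, 2, \ldots, N^{max} - 1\} \tag{21} \\
& T^{max} \in \mathbb{Z} \;\text{ and }\; x_{cjt} \in \{0, 1\} \quad \forall t \in T,\; c \in C,\; j \in N \tag{22}
\end{align}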

Similar to the deterministic case, the above formula-

tion is a constrained integer linear programming prob-

lem, where the decision variables remain unchanged.

Nonetheless, the model now consists of three uncer-

tain parameters that are present in the Eqs. 15 to 17

and are denoted by the vector ξ = [ξ(1), ξ(2), ξ(3)].

Contrary to this, the objective functions (Eqs. 12 and

13) remain unchanged as they are independent of the

three uncertain parameters.

3.3 Chance Constrained Programming

As mentioned in the introduction section, CCP is

emerging as an efﬁcient and tractable approach for

handling stochastic optimization problems. In CCP,

the constraints need to be satisﬁed with a predeﬁned

probability value, say p, but not necessarily for all oc-

casions. Since the uncertain parameters are present in

the constraints, as shown in Eqs. 15 to 17, there is no

guarantee that they will be satisﬁed all the time due

to the varying realizations of bounded uncertain pa-

rameters. As a result, a certain probability value of

constraint satisfaction is associated with each of the

uncertain constraints.

Let us consider a standard optimization formula-

tion with uncertain parameter vector ξ and decision

variable vector x as shown in Eq. 23. On application

of CCP framework, this stochastic optimization can

be represented using Eq. 24 (Mitra, 2013).

\begin{align}
& \min \{ f(x) \mid g(x, \xi) \ge 0 \} \tag{23} \\
& \min \{ f(x) \mid P(g(x, \xi) \ge 0) \ge p \} \tag{24}
\end{align}

where f(x) and g(x, ξ) denote the objective function and the constraint, respectively. In Eq. 24, P represents the probability measure, which takes values between 0 and 1. The higher the value of p, the more reliable yet more conservative the solution. The feasible decision space shrinks progressively as the probability value approaches unity. Since the constraints in the VM allocation model need to be satisfied individually rather than jointly, the CCP formulation can be applied separately to each of the uncertain constraints.
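To illustrate what an individual chance constraint of the form P(g(x) ≥ ξ) ≥ p means in practice, the following sketch estimates the probability of constraint satisfaction by sampling. It is only an illustration under stated assumptions: the function name, the normal distribution for ξ, the fixed seed and the numbers in the comments are hypothetical, and the paper itself uses an analytical deterministic equivalent rather than sampling.

```python
import random


def estimate_satisfaction(capacity: float, mean: float, std: float,
                          samples: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(capacity >= xi) for xi ~ Normal(mean, std).

    `capacity` plays the role of g(x), the resource provided by a fixed allocation,
    and xi is the uncertain requirement. All names here are illustrative.
    """
    rng = random.Random(seed)  # seeded so that repeated runs give the same estimate
    hits = sum(1 for _ in range(samples) if capacity >= rng.gauss(mean, std))
    return hits / samples


# Providing exactly the mean requirement satisfies the constraint about half the time;
# providing two standard deviations more satisfies it in roughly 97.7% of realizations.
p_at_mean = estimate_satisfaction(capacity=100.0, mean=100.0, std=10.0)
p_with_margin = estimate_satisfaction(capacity=120.0, mean=100.0, std=10.0)
```

An allocation is acceptable under CCP when this probability is at least the chosen p, which is why a larger p forces a larger capacity margin.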

Prior to estimation of the probability values, we

need to know the probability distribution of the de-

mand requirements. In this work, for simplicity, we

assume the three uncertain demand requirements to

follow normal distribution; however, CCP can be eas-

ily extended to other types of distributions as well.

Another important point to be noted is that the deci-

sion variables and the uncertain parameters are sepa-

rable in the considered VM allocation model. Owing

to these aspects, the stochastic optimization problem

in Eq. 24 is converted into equivalent deterministic

optimization problem shown as follows:

\begin{align}
& \min \{ f(x) \mid P(g(x) \ge \xi) \ge p \} \tag{25} \\
\implies \; & \min \{ f(x) \mid g(x) \ge \xi_p \} \tag{26} \\
\implies \; & \min \{ f(x) \mid g(x) \ge \bar{\xi} + q_p \sigma_{\xi} \} \tag{27}
\end{align}

where $\bar{\xi}$ and $\sigma_{\xi}$ represent the mean and standard deviation of the uncertain parameter $\xi$, $\xi_p$ is the $p^{th}$ quantile of $\xi$, and $q_p$ denotes the $p^{th}$ quantile of the standard normal distribution with mean = 0 and standard deviation = 1 (for instance, when p = 0.97, $q_p$ corresponds to $q_{0.97}$, which is approximately 1.88). The second term on the right-hand side of the constraint in Eq. 27 ($q_p \sigma_{\xi}$) corrects the nominal demand requirement and lends robustness to the generated optimal allocation of resources under uncertainty. In general, the CCP technique also works if the decision variables and uncertain parameters are non-separable (Mitra, 2013). Since the problem is now converted into deterministic form, any classical or evolutionary optimization algorithm can be used for solving it.
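The deterministic equivalent can be computed directly from the mean, the standard deviation and the chosen probability level. The sketch below is a minimal illustration, assuming a normally distributed requirement; the function name and the example standard deviation are hypothetical, since the text does not state how the standard deviation is obtained from the bounds on the uncertain parameters.

```python
from statistics import NormalDist


def tightened_requirement(mean: float, std: float, p: float) -> float:
    """Right-hand side of the deterministic equivalent: mean + q_p * std.

    q_p is the p-th quantile of the standard normal distribution.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    q_p = NormalDist().inv_cdf(p)  # standard normal quantile
    return mean + q_p * std


# Illustrative numbers only: a mean memory requirement of 77 GB with an
# assumed standard deviation of 5 GB.
rhs_p50 = tightened_requirement(77.0, 5.0, 0.50)   # no correction at p = 0.5
rhs_p75 = tightened_requirement(77.0, 5.0, 0.75)   # about 77 + 0.674 * 5
rhs_p97 = tightened_requirement(77.0, 5.0, 0.97)   # about 77 + 1.881 * 5
```

The correction term grows with p and becomes unbounded as p approaches 1, which is consistent with the feasible region shrinking as higher reliability is demanded.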


Table 2: Specifications of VMs in Amazon EC2.

VM Type       Cost ($/hr)   Storage (GB)   Memory (GB)   Processing (MFLOPS)
c3.large      0.105         32             3.75          8800
c3.xlarge     0.210         80             7.5           17600
c3.2xlarge    0.420         160            15            35200
c3.3xlarge    0.840         320            30            70400
c3.4xlarge    1.680         640            60            140800

4 RESULTS AND DISCUSSIONS

A computationally intensive application, or problem instance, called nug22-sbb has been taken from Heilig et al. (2016). The following user resource requirements are considered for this application, as presented by Heilig et al. (2016): M = 77 GB, S = 51 GB, R = 5067533 GFLOPS (per time period t), T = 12 hrs, B = 343$. The cloud provider is fixed as Amazon EC2, which offers a diverse range of resources for executing the application. In this study, five types of VMs with the specifications shown in Table 2 are chosen as the candidate set of resources (Li et al., 2016). The maximum number of VMs is taken to be the same ($N^{max}$ = 30) for all types of configurations.

4.1 Deterministic MOOP

For the described application with the specified user requirements, the objective of the study is to identify the optimal configuration of VMs at each time period from the set of VMs provided by Amazon EC2. To accomplish this, the constrained two-objective optimization problem presented in Eqs. 1 to 11 is solved using a well-known evolutionary optimization algorithm, NSGA-II, which is able to handle ILP problems. Since classical optimization algorithms are less efficient at generating the entire set of solutions of a MOOP, evolutionary optimizers, which are capable of providing near-global-optimal solutions, are chosen in this paper (Deb, 2015). Being a population-based evolutionary optimizer, NSGA-II generates a set of Pareto-Optimal (PO) solutions in a single simulation run (Deb, 2015). Classical optimization algorithms, on the other hand, often convert the MOOP into a single-objective optimization problem and then solve it multiple times to generate the entire Pareto front, which is a computationally expensive and inefficient process.
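The notion of Pareto optimality used here can be illustrated with a short dominance filter for two minimized objectives (cost, execution time). This is only a sketch of the non-dominated filtering idea that underlies NSGA-II's first front; it omits the ranking of later fronts, crowding distance, selection, crossover and mutation, and the sample points are made up for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in one.

    All objectives are assumed to be minimized.
    """
    return all(ai <= bi for ai, bi in zip(a, b)) and any(
        ai < bi for ai, bi in zip(a, b)
    )


def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (cost in $, execution time in hours) pairs.
candidates = [(10.0, 12), (20.0, 8), (15.0, 12), (30.0, 8), (25.0, 10)]
front = pareto_front(candidates)
```

In this example only (10.0, 12) and (20.0, 8) survive: each remaining candidate is matched or beaten on both objectives by one of these two.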

http://aws.amazon.com/ec2/

Table 3: Optimal VM Configuration obtained for a Solution chosen from the Deterministic Pareto Front.

VM Type \ Time Periods    1-3    4-6    7-9    10-12
c3.large                  1      9      5      0
c3.xlarge                 2      8      2      0
c3.2xlarge                5      7      0      0
c3.3xlarge                1      3      0      0
c3.4xlarge                0      0      0      0

Figure 1: Pareto front obtained on solving the deterministic

VM allocation optimization problem using NSGA-II.

Consequently, the deterministic MOOP, which is an ILP, is solved using binary-coded NSGA-II with population size = 500, number of generations = 500, crossover probability = 0.9 and mutation probability = 0.01. The resulting two-dimensional Pareto front is presented in the objective space in Fig. 1. It is observed that even though the maximum allowable execution time is 12 hours, the application completes within 10 hours (the maximum value of $T^{max}$), with the purchasing cost remaining low and well within the budget limit. From the obtained PO solutions, the cloud broker may choose any one solution based on higher-order information, for example by selecting the resource situated closest to the user's location (which might help in reducing the communication cost). For illustration, one of the PO solutions has been selected and its decision variables are presented in Table 3 for each type of VM provided by Amazon. The number of VMs is reported for windows of three hours each. A total of 43 VMs were required for executing the considered scientific workflow. Moreover, the number of VMs chosen at each time instance (reported per three-hour window in Table 3) does not follow any specific pattern, and one of the VM configurations, c3.4xlarge, was not allocated at any point in the time horizon. This shows that optimal VM allocation is a non-trivial exercise.


Figure 2: Pareto fronts obtained on solving the stochastic

VM allocation problem with varying probability of con-

straint satisfaction using CCP followed by NSGA-II.

4.2 Stochastic MOOP using CCP and NSGA-II

The same application, or problem instance, is analyzed in this section, but with uncertainties in the three demand requirements from the user. The deterministic values used previously (in Section 4.1) are allowed to deviate by 20% to obtain the bounds on the uncertain parameters. In practical scenarios, these bounds will usually be provided by the user or the cloud broker. It is assumed that the three uncertain parameters follow a normal distribution, and the probability of constraint satisfaction (p) is set to 0.75. CCP is then applied to the stochastic optimization problem of VM allocation (Eqs. 12 to 22), which is again a multi-objective ILP. On converting this stochastic formulation into the equivalent deterministic optimization problem using Eqs. 24 to 27 and solving it using NSGA-II, the two-dimensional Pareto-optimal front shown in Fig. 2 is obtained. Compared with the deterministic solution, the solution quality improves with respect to both objective function values. For one of the PO solutions, the decision variables for each type of VM over the entire time horizon (reported per three-hour window) are presented in Table 4. In this case, a total of 30 VMs were required for executing the considered scientific workflow, fewer than in the deterministic case. Further, c3.2xlarge VMs were not allocated at any point in the time horizon, whereas c3.4xlarge VMs were allocated, in contrast to the deterministic solution. This implies that the consideration of uncertainty plays an important role in the selection of optimal VMs.

Additionally, to study the effect of the probability of constraint satisfaction (p), the value of p is varied from 0.75 to 1 and the corresponding solutions are presented in Fig. 2. In practice, a broker can decide on the value of p based on the SLA agreed with the client. It is observed that as p increases, the solution quality varies and sometimes deteriorates, but the reliability of the solution increases. Choosing too high a value of p might lead to conservative solutions, whereas too small a value of p is not advisable either.

Table 4: Optimal VM Configuration obtained for a Solution chosen from the Stochastic Pareto Front (p=0.75).

VM Type \ Time Periods    1-3    4-6    7-9    10-12
c3.large                  3      7      0      0
c3.xlarge                 8      1      3      0
c3.2xlarge                0      0      0      0
c3.3xlarge                0      2      3      0
c3.4xlarge                0      0      3      0
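The VM totals quoted for the two selected solutions can be reproduced from Tables 3 and 4 with a few lines of arithmetic. The snippet below simply transcribes the table entries (one count per three-hour window); the dictionary and function names are illustrative and not part of the original study.

```python
# VM counts per three-hour window (1-3, 4-6, 7-9, 10-12), transcribed from Table 3.
TABLE_3 = {
    "c3.large": [1, 9, 5, 0],
    "c3.xlarge": [2, 8, 2, 0],
    "c3.2xlarge": [5, 7, 0, 0],
    "c3.3xlarge": [1, 3, 0, 0],
    "c3.4xlarge": [0, 0, 0, 0],
}

# VM counts per three-hour window, transcribed from Table 4 (p = 0.75).
TABLE_4 = {
    "c3.large": [3, 7, 0, 0],
    "c3.xlarge": [8, 1, 3, 0],
    "c3.2xlarge": [0, 0, 0, 0],
    "c3.3xlarge": [0, 2, 3, 0],
    "c3.4xlarge": [0, 0, 3, 0],
}


def total_vms(table):
    """Sum the reported VM counts over all types and windows."""
    return sum(sum(row) for row in table.values())


def unused_types(table):
    """List the VM types that are never allocated in the time horizon."""
    return [name for name, row in table.items() if sum(row) == 0]


deterministic_total = total_vms(TABLE_3)   # 43
stochastic_total = total_vms(TABLE_4)      # 30
```

The sums match the totals of 43 and 30 VMs stated in the text, and `unused_types` returns c3.4xlarge for the deterministic solution and c3.2xlarge for the stochastic one.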

5 CONCLUSIONS

This paper addresses the problem of VM allocation for scientific workflows with multiple objectives. The aim is to minimize the purchasing cost and the execution time while satisfying the users' uncertain demand requirements. A constrained stochastic multi-objective optimization problem has been formulated using Chance Constrained Programming (CCP), converted into its deterministic equivalent and solved using NSGA-II. The results indicate that the deterministic optimization solutions are inferior to those obtained for the stochastic formulation, which might be due to the increased feasible region. Further, the effect of varying the level of constraint satisfaction in CCP is studied.

Future studies in this direction could include: (i) consideration of joint constraints in both the deterministic and stochastic optimization models, (ii) implementation of a frequentist approach for calculating probability values in CCP, and (iii) consideration of multi-cloud environments and other cost components in the model.

REFERENCES

Aslam, S., ul Islam, S., Khan, A., Ahmed, M., Akhundzada,

A., and Khan, M. K. (2017). Information collection

centric techniques for cloud resource management:

Taxonomy, analysis and challenges. Journal of Net-

work and Computer Applications, 100:80–94.

Calheiros, R. N. and Buyya, R. (2013). Meeting deadlines

of scientiﬁc workﬂows in public clouds with tasks

replication. IEEE Transactions on Parallel and Dis-

tributed Systems, 25(7):1787–1796.

Calzarossa, M. C., Della Vedova, M. L., and Tessera, D.

(2019). A methodological framework for cloud re-

source provisioning and scheduling of data parallel

applications under uncertainty. Future Generation

Computer Systems, 93:212–223.

Coutinho, R. d. C., Drummond, L. M., Frota, Y., and

de Oliveira, D. (2015). Optimizing virtual machine al-

location for parallel scientiﬁc workﬂows in federated

clouds. Future Generation Computer Systems, 46:51–

68.

Deb, K. (2015). Multi-objective evolutionary algorithms.

In Springer handbook of computational intelligence,

pages 995–1015. Springer.

Diwekar, U. M. (2020). Introduction to applied optimiza-

tion, volume 22. Springer Nature.

Fard, H. M., Prodan, R., and Fahringer, T. (2014). Multi-

objective list scheduling of workﬂow applications in

distributed computing infrastructures. Journal of Par-

allel and Distributed Computing, 74(3):2152–2165.

Ferdaus, M. H., Murshed, M., Calheiros, R. N., and Buyya,

R. (2017). An algorithm for network and data-aware

placement of multi-tier applications in cloud data cen-

ters. Journal of Network and Computer Applications,

98:65–83.

Heilig, L., Lalla-Ruiz, E., and Voß, S. (2016). A cloud bro-

kerage approach for solving the resource management

problem in multi-cloud environments. Computers &

Industrial Engineering, 95:16–26.

Heilig, L., Lalla-Ruiz, E., and Voß, S. (2020). Modeling and solving cloud service purchasing in multi-cloud environments. Expert Systems with Applications, 147:113165.

Hu, H., Li, Z., Hu, H., Chen, J., Ge, J., Li, C., and Chang,

V. (2018). Multi-objective scheduling for scientiﬁc

workﬂow in multicloud environment. Journal of Net-

work and Computer Applications, 114:108–122.

Li, Z., Ge, J., Yang, H., Huang, L., Hu, H., Hu, H., and

Luo, B. (2016). A security and cost aware scheduling

algorithm for heterogeneous tasks of scientiﬁc work-

ﬂow in clouds. Future Generation Computer Systems,

65:140–152.

Madni, S. H. H., Abd Latiff, M. S., Coulibaly, Y., et al. (2016). Resource scheduling for infrastructure as a service (IaaS) in cloud computing: Challenges and opportunities. Journal of Network and Computer Applications, 68:173–200.

Mao, M. and Humphrey, M. (2011). Auto-scaling to min-

imize cost and meet application deadlines in cloud

workﬂows. In SC’11: Proceedings of 2011 Interna-

tional Conference for High Performance Computing,

Networking, Storage and Analysis, pages 1–12. IEEE.

Mitra, K. (2013). Chance constrained programming to han-

dle uncertainty in nonlinear process models. Multi-

Objective Optimization in Chemical Engineering: De-

velopments and Applications, pages 183–215.

Mohammadi, S., Pedram, H., and PourKarimi, L. (2018).

Integer linear programming-based cost optimization

for scheduling scientiﬁc workﬂows in multi-cloud

environments. The Journal of Supercomputing,

74(9):4717–4745.

Ning, C. and You, F. (2019). Optimization under uncer-

tainty in the era of big data and deep learning: When

machine learning meets mathematical programming.

Computers & Chemical Engineering, 125:434–448.

Odetayo, B., Kazemi, M., MacCormack, J., Rosehart,

W. D., Zareipour, H., and Seiﬁ, A. R. (2018). A

chance constrained programming approach to the in-

tegrated planning of electric power generation, natural

gas network and storage. IEEE Transactions on Power

Systems, 33(6):6883–6893.

Pandey, S., Wu, L., Guru, S. M., and Buyya, R. (2010).

A particle swarm optimization-based heuristic for

scheduling workﬂow applications in cloud computing

environments. In 2010 24th IEEE international con-

ference on advanced information networking and ap-

plications, pages 400–407. IEEE.

Ramamurthy, A., Saurabh, S., Gharote, M., and Lodha, S.

(2020). Selection of cloud service providers for host-

ing web applications in a multi-cloud environment.

In 2020 IEEE International Conference on Services

Computing (SCC), pages 202–209. IEEE.

Rodriguez, M. A. and Buyya, R. (2014). Deadline based resource provisioning and scheduling algorithm for scientific workflows on clouds. IEEE Transactions on Cloud Computing, 2(2):222–235.

Tchernykh, A., Schwiegelsohn, U., Talbi, E.-g., and

Babenko, M. (2019). Towards understanding uncer-

tainty in cloud computing with risks of conﬁdentiality,

integrity, and availability. Journal of Computational

Science, 36:100581.

Wang, X. and Ning, Y. (2017). Uncertain chance-constrained programming model for project scheduling problem. Journal of the Operational Research Society, pages 1–9.

Wen, Y., Chen, Z., Chen, T., Liu, J., and Kang, G. (2012). A

particle swarm optimization algorithm for batch pro-

cessing workﬂow scheduling. In 2012 Second Inter-

national Conference on Cloud and Green Computing,

pages 645–649. IEEE.

Zeng, L., Veeravalli, B., and Zomaya, A. Y. (2015). An

integrated task computation and data management

scheduling strategy for workﬂow applications in cloud

environments. Journal of Network and Computer Ap-

plications, 50:39–48.

Zhu, Z., Zhang, G., Li, M., and Liu, X. (2015). Evolutionary multi-objective workflow scheduling in cloud. IEEE Transactions on Parallel and Distributed Systems, 27(5):1344–1357.
