Multi-objective Optimization for Virtual Machine Allocation in
Computational Scientific Workflow under Uncertainty
Arun Ramamurthy (1), Priyanka Pantula (2), Mangesh Gharote (1),
Kishalay Mitra (2) and Sachin Lodha (1)
(1) TCS Research and Innovation, Tata Consultancy Services, India
(2) Indian Institute of Technology, Hyderabad, India
Keywords:
Cloud Computing, Scientific Workflow, Resource Allocation, Multi-objective Optimization, NSGA-II,
Chance Constrained Programming.
Abstract:
Providing resources and services from various cloud providers is now an increasingly promising paradigm.
Workflow applications are becoming increasingly computation-intensive or data-intensive, with resource al-
location being maintained in terms of pay per usage. In this paper, a multi-objective optimization study for
scientific workflow in a cloud environment is proposed. The aim is to minimize execution time and purchas-
ing cost simultaneously while satisfying the demand requirements of customers. The uncertainties present in
the model are identified and handled using a well-known technique called Chance Constrained Programming
(CCP) for real-world implementation. The model is solved using the Non-dominated Sorting Genetic Algo-
rithm II (NSGA-II). This comprehensive study shows that the solutions obtained on considering uncertainties
vary from the deterministic case. Based on the probability of constraint satisfaction, the objective functions
improve but at the cost of reliability of the solution.
1 INTRODUCTION
Cloud computing has emerged as a popular paradigm,
where computing resources are provided based on the
demand raised by the users in terms of pay per use
pricing mechanism (Aslam et al., 2017; Ferdaus et al.,
2017). On a cloud platform, a data center often
manages a large pool of Virtual Machines (VMs), which
are used to execute computationally intensive
tasks. In the present commercial environment, each
cloud provider offers a diverse range of VM types at
varying prices, and the user has to select
the best resources for executing a particular task.
Therefore, providing optimal resources and services
from cloud providers is a vital paradigm of research
(Mohammadi et al., 2018; Hu et al., 2018; Heilig
et al., 2020; Ramamurthy et al., 2020).
In the cloud environment, optimal VM allocation for scientific
workflows is formulated by considering two main aspects: (i) cost
components such as purchasing cost, resource sharing cost and so on,
and (ii) execution time, while meeting the users' requirements. The
objective function varies linearly with respect to both these
components and, in the literature, this allocation problem is known
to be NP-hard (Madni et al., 2016). The decision variables include
the number of VMs allocated, the configuration of the VMs (provided
by the cloud provider), the time at which a VM is allocated and the
total execution time. Besides the challenges of solving NP-hard
problems, in practical scenarios the user's requirements such as
memory, storage capacity and so on might be non-deterministic. They
may vary due to their dependence on the percentage of work completed,
uncertainties in task execution or inaccurate estimation of
requirements. Thus, there is a need to optimally allocate VMs while
simultaneously considering cost components, execution time and
uncertain requirements, which are bounded rather than fixed values.
Under uncertainty, problems are most often assumed to be
deterministic for ease of handling and are solved using deterministic
optimization algorithms. However, such a study might lead to
unrealistic solutions or decisions in practical scenarios (Diwekar,
2020). To illustrate, while trying to minimize the computation cost
of an application under dynamically changing demand for a resource, a
cost based on a deterministic demand might deteriorate the
application efficiency or increase the energy consumption during
periods in which the demand is over- or under-estimated. As a result,
over the past few years, uncertainty handling during decision making
has been gaining importance in both industrial and academic research
(Diwekar, 2020; Ning and You, 2019).
Ramamurthy, A., Pantula, P., Gharote, M., Mitra, K. and Lodha, S.
Multi-objective Optimization for Virtual Machine Allocation in Computational Scientific Workflow under Uncertainty.
DOI: 10.5220/0010453302400247
In Proceedings of the 11th International Conference on Cloud Computing and Services Science (CLOSER 2021), pages 240-247
ISBN: 978-989-758-510-4
Copyright © 2021 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
Some of the well-known optimization under un-
certainty handling techniques include Stochastic Pro-
gramming, Chance Constrained Programming (CCP),
Robust Optimization, Expected Value Model and
Fuzzy Mathematical Programming (Diwekar, 2020).
Among these, CCP emerges as one of the popular
approaches for efficiently dealing with uncertainties.
It is applied to diverse domains of research includ-
ing the topics from scheduling, process modelling
and optimization, process design and so on (Odetayo
et al., 2018; Wang and Ning, 2017). In CCP, the constraints need to
be satisfied with a predefined probability, rather than for all
realizations of the uncertain parameters. The reliability of the
solution therefore depends on the chosen probability of constraint
satisfaction. To make such a probabilistic formulation tractable, the
CCP formulation is converted into an equivalent deterministic
formulation, which is then handled by any deterministic optimization
technique. In the optimal VM allocation problem, since the uncertain
parameters (users' requirements) appear linearly in the constraints,
the CCP approach can be applied by implementing a coordinate
transformation or by calculating classical probability values (Mitra,
2013). Additionally, the problem size in CCP remains manageable even
when the number of uncertain parameters increases.
Despite the diverse range of applications of CCP, the uncertainties
arising in the VM allocation problem are rarely addressed in the
literature. This might be because the underlying deterministic
optimization problem is already hard. Moreover, the deterministic VM
allocation problem is multi-objective in nature. Various trade-offs
can occur among goals such as fast implementation, low-cost
resources, secure and reliable resources, low energy wastage and so
on. However, in most of the existing works, the constrained
Multi-Objective Optimization Problem (MOOP) is treated as a
constrained single-objective optimization problem, which is a less
efficient way of solving MOOPs since it needs to be solved multiple
times to obtain the complete set of Pareto-Optimal (PO) solutions
(Heilig et al., 2016).
Owing to the formulation of MOOP in determinis-
tic case, the uncertain optimization problem also re-
sults in multiple objectives where some of the con-
straints remain uncertain. Therefore, there is a need
to solve the multi-objective optimization problem of
VM allocation efficiently along with consideration of
uncertain user requirements. Nevertheless, the prob-
lem is non-trivial as the resources need to be allocated
optimally at each time instance over the entire time
horizon along with identification of the right resource
configuration, in the presence of varying yet bounded
requirements.
In this paper, the aforementioned deterministic MOOP of resource
allocation is solved using a well-known evolutionary optimization
algorithm called Non-dominated Sorting Genetic Algorithm II
(NSGA-II), which is capable of handling the conflicting objectives
efficiently. Moreover, some of the
uncertain parameters present in the problem are iden-
tified and the optimization under uncertainty prob-
lem is solved using CCP. The solutions thus obtained
are analyzed and significant conclusions are drawn.
The overall methodology is generic enough to allo-
cate VMs from a cloud provider irrespective of the ap-
plication of the scientific workflow considered. Even
though, in this study, the uncertain parameters are
present in the constraints alone, CCP technique is ef-
ficient enough for handling uncertain objective func-
tion(s) also.
The rest of the paper is organized as follows: Sec-
tion 2 describes a brief review of the existing work
in resource allocation for cloud computing and opti-
mization under uncertainty. Section 3 illustrates the
mathematical model for deterministic and stochas-
tic optimization. In Section 4, the method is tested
against a specific application called nug22-sbb using
the VMs provided by Amazon. Finally, the work is
concluded in Section 5.
2 RELATED WORK
Resource allocation for scientific workflow tasks in the cloud is a
challenging research problem. Many works in the literature have
proposed ways to find an optimal workflow schedule such that user
requirements are met. Mao and Humphrey (2011) presented an
auto-scaling mechanism to minimize cost and meet application
deadlines in cloud workflows. Calheiros and Buyya (2013) developed an
algorithm that uses the idle time of provisioned resources and budget
surplus to replicate tasks; this task replication strategy mitigates
the effects of performance variation of resources so that the soft
deadline of the workflow is met. Zeng et al. (2015) proposed a
Security-Aware and Budget-Aware (SABA) scheduling scheme for
optimizing the makespan under both security and budget constraints.
Considering the security threats in the cloud, a Security and Cost
Aware Scheduling (SCAS) mechanism was devised for scientific workflow
applications with heterogeneous tasks (Li et al., 2016). However,
single-objective workflow scheduling methods fail to provide diverse
solutions for cloud users to choose from.
Most of these studies do not employ a global optimization technique
capable of producing a near-optimal solution. Instead, they rely on
task-level optimization and thus fail to take advantage of the entire
workflow structure and characteristics to generate a globally optimal
solution. However, a few other studies have applied global
optimization algorithms to the workflow VM allocation problem. For
instance, Pandey et al. (2010) proposed a Particle Swarm Optimization
(PSO) based algorithm to minimize the execution cost of a single
workflow while balancing the task load on the available resources.
Aiming at shortcomings in existing scheduling methods for batch
processing workflows, Wen et al. (2012) investigated the optimization
problem of grouping and scheduling multiple activity instances in
batch processing workflows. Rodriguez and Buyya (2014) used a PSO
algorithm to minimize the overall workflow execution cost while
meeting the deadline constraint in clouds; it was devised to meet the
users' requirements and to incorporate the basic principles of cloud
computing. Nonetheless, in this small body of work on global
optimization techniques, the VM allocation model was formulated with
a single objective even though the problem is known to be
multi-objective in nature. This makes the corresponding optimization
study less effective and, most of the time, leaves many feasible
regions unexplored.
Some works in the literature have also considered multiple objectives
in workflow scheduling. In (Fard et al., 2014), a list scheduling
heuristic for multi-objective workflow scheduling was developed for
cloud-based and heterogeneous distributed computing systems. Zhu et
al. (2015) developed a multi-objective optimization method based on
an evolutionary algorithm to address workflow scheduling in the cloud
computing environment. However, research in this domain is still in
progress and the existing algorithms may not be directly applicable
in the cloud environment due to their complex nature. Hence, there is
a need to formulate the VM allocation or workflow scheduling model as
a multi-objective problem that is tractable and easily scalable, such
that the optimal resources which satisfy the users' requirements are
identified using global optimization algorithms. Moreover, only a
limited number of existing works have considered the uncertainties
arising in customer demand requirements, which is often the realistic
case (Calzarossa et al., 2019; Tchernykh et al., 2019). Due to the
random nature of these uncertain parameters, the multi-objective
optimization formulation of VM allocation takes an uncertain or
stochastic form. Over the last few years, uncertainty handling has
become a major domain of research as it helps in the practical
implementation of the obtained solutions. Therefore, in this paper,
the multi-objective optimization problem of VM allocation is
formulated considering the uncertainties and solved using a global
optimization algorithm.
Table 1: Notations Description for the Model Parameters.
Symbol      Description
$R$         Total processing requirement
$M$         Total memory requirement
$S$         Total storage requirement
$C$         Set of VM types
$T$         Time horizon
$N_M$       Maximum available VMs in each type
$N$         Set of VM instances in each type ($|N| = N_M$)
$v_c$       Cost of renting VM type $c$ for a time period
$s_c$       Storage capacity of VM type $c$
$m_c$       Memory capacity of VM type $c$
$r_c$       Processing capacity of VM type $c$
$T_E$       Last time period a VM has been allocated
$x_{cjt}$   1 if VM $j$ of type $c$ is allocated at time $t$; 0 otherwise
3 OPTIMAL RESOURCE
ALLOCATION UNDER
UNCERTAINTY
In this section, we discuss the deterministic and
stochastic optimization model for resource allocation.
Further, we state the Chance Constrained Program-
ming technique for solving the stochastic optimiza-
tion problem.
3.1 Deterministic Multi-objective
Optimization Model
Prior to the problem definition, the notation used for describing the
model in the following sections of the paper is presented in Table 1.
Let us consider a set of consumer requirements, which can be further
categorized as application-based and non-application-based
requirements. The former category consists of the following demand
requirements: (i) total processing requirement $R$, (ii) total memory
requirement $M$, and (iii) total storage requirement $S$. The latter
category, which is non-application-based, includes the upper limits
on (a) budget $B$ and (b) execution time $T_E$, associated with the
deployment of workflow applications on different resources. In this
paper, one main cost component, the purchasing cost of the various VM
types, is considered along with the overall execution time for
completion of a specific application. The application is executed
faster, or in other words $T_E$ is lowered, if high-power VMs are
used. However, using high-power VMs increases the service cost, as VM
cost increases with computational power. Therefore, there exists an
evident trade-off between the two objectives. Hence, the goal of this
study is to simultaneously minimize the purchasing cost of VMs and
minimize $T_E$, where the decision variables comprise the number of
VMs of each configuration offered by the provider, the time of usage
of each of these VMs and the execution time ($T_E$ acts as both a
decision variable and an objective function). Additionally, the
formulated multi-objective optimization problem contains constraints
based on the application-based and non-application-based users'
requirements.
The aforementioned constrained multi-objective
optimization formulation is termed as optimal VM
allocation problem and the same is mathematically
represented using the equations shown below (Heilig
et al., 2016; Coutinho et al., 2015).
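\begin{align}
\min \quad & \sum_{c \in C} \sum_{j \in N} \sum_{t \in T} v_c \, x_{cjt} \tag{1} \\
\min \quad & T_E \tag{2}
\end{align}
subject to
\begin{align}
& \sum_{c \in C} \sum_{j \in N} \sum_{t \in T} v_c \, x_{cjt} \le B \tag{3} \\
& \sum_{c \in C} \sum_{j \in N} s_c \, x_{cjt} \ge S \, x_{c'j't} && \forall t \in T,\; c' \in C,\; j' \in N \tag{4} \\
& \sum_{c \in C} \sum_{j \in N} m_c \, x_{cjt} \ge M \, x_{c'j't} && \forall t \in T,\; c' \in C,\; j' \in N \tag{5} \\
& \sum_{c \in C} \sum_{j \in N} \sum_{t \in T} r_c \, x_{cjt} \ge R \tag{6} \\
& \sum_{j \in N} x_{cjt} \le N_M && \forall t \in T,\; c \in C \tag{7} \\
& T_E \ge t \, x_{cjt} && \forall t \in T,\; c \in C,\; j \in N \tag{8} \\
& x_{cj(t+1)} \le x_{cjt} && \forall t \in \{1, 2, \dots, |T| - 1\},\; c \in C,\; j \in N \tag{9} \\
& x_{c(j+1)t} \le x_{cjt} && \forall t \in T,\; c \in C,\; j \in \{1, 2, \dots, N_M - 1\} \tag{10} \\
& T_E \in \mathbb{Z}^{+} \text{ and } x_{cjt} \in \{0, 1\} && \forall t \in T,\; c \in C,\; j \in N \tag{11}
\end{align}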
Since all the decision variables of the VM allocation model are
restricted to integers, this is an Integer Linear Programming (ILP)
problem, which is NP-hard in general (Madni et al., 2016). The
constraints in Eqs. 3 to 11 imply the following:
Eq. 3 ensures that the purchasing cost of the different VM types does
not surpass the budget. Eq. 4 ensures that the purchased storage
capacity is sufficient to satisfy the storage requirement ($S$) at
each time period. Eq. 5 ensures that the purchased memory capacity is
sufficient to satisfy the memory requirement ($M$) at each time
period. Eq. 6 ensures that the purchased processing capacity is
sufficient to satisfy the overall processing demand ($R$). Eq. 7
guarantees that the number of VMs used does not exceed the maximum
number of VMs available for each VM type at each time period. We
assume that the maximum number of VMs available is the same for all
VM types. Eq. 8 defines the execution time: $T_E$ is no earlier than
the last time period in which any VM is allocated. Eq. 9 ensures that
a VM of a specific type which is selected at time $t+1$ is also
selected at time $t$. Eq. 10 ensures that the $(j+1)$-th VM is used
only if the $j$-th VM is assigned.
The decision variables comprise two components: $x_{cjt}$, which is
binary, and $T_E$, which is integer-valued (as shown in Eq. 11). The
model can be extended to a multi-cloud environment and other cost
components can also be considered. However, the inclusion of those
additional components does not affect the implementation of the
proposed framework discussed below. The methodology can be scaled to
the extended version of the model as well.
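As a rough illustration of how Eqs. 1 to 11 act on a candidate solution, the following minimal Python sketch evaluates both objectives and checks feasibility for a binary allocation array. The function name `evaluate_allocation`, the nested-list layout `x[c][j][t]` and the toy instance are assumptions made for this example only; they are not taken from the paper, and the time index is 0-based in the code.

```python
from itertools import product


def evaluate_allocation(x, v, s, m, r, S, M, R, B, N_M):
    """Evaluate a binary allocation x[c][j][t] against Eqs. 1 to 11.

    Returns (cost, T_E, feasible). Index t is 0-based here, so the
    time period used in Eq. 8 is t + 1.
    """
    n_types, n_inst, n_periods = len(x), len(x[0]), len(x[0][0])
    cells = list(product(range(n_types), range(n_inst), range(n_periods)))

    # Eq. 1: purchasing cost summed over all types, instances and periods.
    cost = sum(v[c] * x[c][j][t] for c, j, t in cells)
    # Eqs. 2 and 8: under minimization, T_E is the last active period.
    T_E = max((t + 1 for c, j, t in cells if x[c][j][t]), default=0)

    feasible = cost <= B                                            # Eq. 3
    feasible &= sum(r[c] * x[c][j][t] for c, j, t in cells) >= R    # Eq. 6
    for t in range(n_periods):
        column = [(c, j) for c in range(n_types) for j in range(n_inst)]
        if any(x[c][j][t] for c, j in column):
            # Eqs. 4 and 5 bind only in periods where some VM is active.
            feasible &= sum(s[c] * x[c][j][t] for c, j in column) >= S
            feasible &= sum(m[c] * x[c][j][t] for c, j in column) >= M
        for c in range(n_types):
            feasible &= sum(x[c][j][t] for j in range(n_inst)) <= N_M  # Eq. 7
    for c, j, t in cells:
        if t + 1 < n_periods:
            feasible &= x[c][j][t + 1] <= x[c][j][t]                # Eq. 9
        if j + 1 < n_inst:
            feasible &= x[c][j + 1][t] <= x[c][j][t]                # Eq. 10
    return cost, T_E, feasible


# Hypothetical toy instance: 2 VM types, 2 instances per type, 3 periods.
v, s, m, r = [1.0, 2.0], [10, 20], [4, 8], [100, 200]
x = [
    [[1, 0, 0], [0, 0, 0]],   # type 0: instance 0 runs in period 1 only
    [[1, 1, 0], [0, 0, 0]],   # type 1: instance 0 runs in periods 1 and 2
]
result = evaluate_allocation(x, v, s, m, r, S=20, M=8, R=500, B=10, N_M=2)
```

A checker of this kind could serve as the fitness and constraint-violation routine inside a population-based solver; an exact ILP solver would instead take the constraints in algebraic form.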
3.2 Stochastic Optimization Model
In the deterministic VM allocation model (Eqs. 1 to
11), the customer requirements which are categorized
as application and non-application-based, may not al-
ways be fixed. For instance, in order to execute a data
mining task, which is computationally intensive, the
user may purchase 3 VMs of type 1 and 2 VMs of
type 2 for a period of 20 hours, hoping that the task
would be completed within that time. However, after
completion of around 50% of the task, the customer
might change the requirements, either increase or de-
crease the VMs of each type, or even request for a new
type of VM, owing to the computational speed and the
status of the usage of resources deployed so far. This
flexible nature of users' requirements not only enables
them to choose sufficient and appropriate VMs for
faster completion of the task but also helps in
eliminating the unnecessary cost of resources. Note that
in most cases the inputs provided by the user keep
varying and are hence termed uncertain parameters.
On considering the aforementioned uncertainties,
the stochastic (or uncertain) optimization formulation
is shown as follows:
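\begin{align}
\min \quad & \sum_{c \in C} \sum_{j \in N} \sum_{t \in T} v_c \, x_{cjt} \tag{12} \\
\min \quad & T_E \tag{13}
\end{align}
subject to
\begin{align}
& \sum_{c \in C} \sum_{j \in N} \sum_{t \in T} v_c \, x_{cjt} \le B \tag{14} \\
& \sum_{c \in C} \sum_{j \in N} s_c \, x_{cjt} \ge \xi(1) \, x_{c'j't} && \forall t \in T,\; c' \in C,\; j' \in N \tag{15} \\
& \sum_{c \in C} \sum_{j \in N} m_c \, x_{cjt} \ge \xi(2) \, x_{c'j't} && \forall t \in T,\; c' \in C,\; j' \in N \tag{16} \\
& \sum_{c \in C} \sum_{j \in N} \sum_{t \in T} r_c \, x_{cjt} \ge \xi(3) \tag{17} \\
& \sum_{j \in N} x_{cjt} \le N_M && \forall t \in T,\; c \in C \tag{18} \\
& T_E \ge t \, x_{cjt} && \forall t \in T,\; c \in C,\; j \in N \tag{19} \\
& x_{cj(t+1)} \le x_{cjt} && \forall t \in \{1, 2, \dots, |T| - 1\},\; c \in C,\; j \in N \tag{20} \\
& x_{c(j+1)t} \le x_{cjt} && \forall t \in T,\; c \in C,\; j \in \{1, 2, \dots, N_M - 1\} \tag{21} \\
& T_E \in \mathbb{Z}^{+} \text{ and } x_{cjt} \in \{0, 1\} && \forall t \in T,\; c \in C,\; j \in N \tag{22}
\end{align}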
Similar to the deterministic case, the above formula-
tion is a constrained integer linear programming prob-
lem, where the decision variables remain unchanged.
Nonetheless, the model now consists of three uncer-
tain parameters that are present in the Eqs. 15 to 17
and are denoted by the vector ξ = [ξ(1), ξ(2), ξ(3)].
Contrary to this, the objective functions (Eqs. 12 and
13) remain unchanged as they are independent of the
three uncertain parameters.
3.3 Chance Constrained Programming
As mentioned in the introduction section, CCP is
emerging as an efficient and tractable approach for
handling stochastic optimization problems. In CCP,
the constraints need to be satisfied with a predefined
probability value, say p, but not necessarily for all oc-
casions. Since the uncertain parameters are present in
the constraints, as shown in Eqs. 15 to 17, there is no
guarantee that they will be satisfied all the time due
to the varying realizations of bounded uncertain pa-
rameters. As a result, a certain probability value of
constraint satisfaction is associated with each of the
uncertain constraints.
Let us consider a standard optimization formula-
tion with uncertain parameter vector ξ and decision
variable vector x as shown in Eq. 23. On application
of CCP framework, this stochastic optimization can
be represented using Eq. 24 (Mitra, 2013).
\begin{align}
& \min_{x} \; \{ f(x) \mid g(x, \xi) \le 0 \} \tag{23} \\
& \min_{x} \; \{ f(x) \mid P(g(x, \xi) \le 0) \ge p \} \tag{24}
\end{align}
where $f(x)$ and $g(x, \xi)$ denote the objective function and the
constraint, respectively. In Eq. 24, $P$ represents the probability
measure, which takes values between 0 and 1. The higher the value of
$p$, the more reliable yet more conservative the solution. The
feasible decision space shrinks progressively as the probability
value approaches unity. Since the constraints in the VM allocation
model need to be satisfied individually rather than jointly, the CCP
formulation can be applied separately to each of the uncertain
constraints.
Prior to estimation of the probability values, we
need to know the probability distribution of the de-
mand requirements. In this work, for simplicity, we
assume the three uncertain demand requirements to
follow normal distribution; however, CCP can be eas-
ily extended to other types of distributions as well.
Another important point to be noted is that the deci-
sion variables and the uncertain parameters are sepa-
rable in the considered VM allocation model. Owing
to these aspects, the stochastic optimization problem
in Eq. 24 is converted into equivalent deterministic
optimization problem shown as follows:
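\begin{align}
& \min_{x} \; \{ f(x) \mid P(\tilde{g}(x) \ge \xi) \ge p \} \tag{25} \\
= \; & \min_{x} \; \{ f(x) \mid \tilde{g}(x) \ge \hat{\xi} \} \tag{26} \\
= \; & \min_{x} \; \{ f(x) \mid \tilde{g}(x) \ge \bar{\xi} + q_p \sigma_{\xi} \} \tag{27}
\end{align}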
where $\bar{\xi}$ and $\sigma_{\xi}$ represent the mean and standard
deviation of the uncertain parameter $\xi$, and $q_p$ denotes the
$p$-th quantile of the standard normal distribution with mean 0 and
standard deviation 1 (for instance, when $p = 0.97$, $q_p$
corresponds to $q_{0.97}$, which is approximately 1.88, i.e. roughly
2). The second term on the right-hand side of the constraint in Eq.
27 ($q_p \sigma_{\xi}$) corrects the nominal demand requirement and
delivers robustness of the generated optimal allocation of resources
under uncertainty. In general, the CCP technique also works if the
decision variables and uncertain parameters are non-separable (Mitra,
2013). Since the problem is now converted into deterministic form,
any classical or evolutionary optimization algorithm can be used for
solving it.
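The step from Eq. 25 to Eq. 27 can be checked numerically. For a normally distributed demand, P(capacity >= demand) >= p holds exactly when the capacity is at least the mean plus $q_p$ standard deviations. The sketch below computes that right-hand side with the Python standard library and verifies it by sampling. The helper names and the demand statistics (mean 100, standard deviation 10) are hypothetical values chosen for illustration, not data from the paper.

```python
import random
from statistics import NormalDist


def deterministic_equivalent(mean, std, p):
    """Right-hand side of Eq. 27 for a normally distributed demand."""
    q_p = NormalDist().inv_cdf(p)   # p-th quantile of the standard normal
    return mean + q_p * std


def empirical_satisfaction(capacity, mean, std, samples=100_000, seed=0):
    """Fraction of sampled demands that the given capacity can cover."""
    rng = random.Random(seed)
    hits = sum(capacity >= rng.gauss(mean, std) for _ in range(samples))
    return hits / samples


# Hypothetical demand statistics, for illustration only.
mean, std, p = 100.0, 10.0, 0.97
required = deterministic_equivalent(mean, std, p)
observed = empirical_satisfaction(required, mean, std)
print(f"required capacity: {required:.3f}")
print(f"observed satisfaction rate: {observed:.4f} (target {p})")
```

Because the uncertain parameter is separable from the decision variables, this quantile needs to be computed only once per constraint, before the optimizer is run.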
Table 2: Specifications of VMs in Amazon EC2.
VM Type      $v_c$ ($/hr)   $s_c$ (GB)   $m_c$ (GB)   $r_c$ (MFLOPS)
c3.large     0.105          32           3.75         8800
c3.xlarge    0.210          80           7.5          17600
c3.2xlarge   0.420          160          15           35200
c3.3xlarge   0.840          320          30           70400
c3.4xlarge   1.680          640          60           140800
4 RESULTS AND DISCUSSIONS
An application or problem instance called nug22-sbb, which is
computationally intensive, has been taken from (Heilig et al., 2016).
The following resource requirements are considered from the user for
this specific application, as presented by Heilig et al. (2016):
M = 77 GB, S = 51 GB, R = 5067533 GFLOPS (per time period t),
T = 12 hrs, B = 343 USD. The cloud provider is fixed as Amazon EC2
(see footnote 1), which offers a diverse range of resources for
executing the application. In this study, five types of VMs are
chosen as the candidate set of resources, with the specifications
shown in Table 2 (Li et al., 2016). The maximum number of VMs is
taken to be the same ($N_M = 30$) for all types of configurations.
4.1 Deterministic MOOP
For the described application with the specified user requirements,
the objective of the study is to identify the optimal configuration
of VMs at each time period from the set of VMs provided by Amazon
EC2. To accomplish this, the constrained two-objective optimization
problem presented in Eqs. 1 to 11 is solved using a well-known
evolutionary optimization algorithm called NSGA-II, which is able to
handle ILP problems. Since classical optimization algorithms are
found to be less efficient at generating the entire set of solutions
of MOOPs, evolutionary optimizers, which are capable of providing
near-global-optimal solutions, are chosen in this paper (Deb, 2015).
Being a population-based evolutionary optimizer, NSGA-II generates a
set of optimal trade-off solutions, called Pareto-Optimal (PO)
solutions, in a single simulation run (Deb, 2015). Classical
optimization algorithms, on the other hand, often convert the MOOP
into a single-objective optimization problem and then solve it
multiple times to generate the entire Pareto front, which is a
computationally expensive and inefficient process.
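To make the notion of Pareto-optimality concrete, the following minimal Python sketch filters a list of (purchasing cost, execution time) pairs down to its non-dominated subset for two minimization objectives. This is only the first-front step of non-dominated sorting, not a full NSGA-II implementation, and the candidate values are hypothetical numbers chosen for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    strictly_better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (cost, T_E) pairs, for illustration only.
candidates = [(40.0, 10), (55.0, 8), (60.0, 9), (90.0, 6), (95.0, 6)]
front = non_dominated(candidates)
print(front)
```

In this toy set, the pair (60.0, 9) is dropped because (55.0, 8) is both cheaper and faster, and (95.0, 6) is dropped because (90.0, 6) is cheaper at the same execution time.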
Footnote 1: http://aws.amazon.com/ec2/
Table 3: Optimal VM Configuration obtained for a Solution
chosen from the Deterministic Pareto Front.
VM Type      Time Periods
             1-3    4-6    7-9    10-12
c3.large     1      9      5      0
c3.xlarge    2      8      2      0
c3.2xlarge   5      7      0      0
c3.3xlarge   1      3      0      0
c3.4xlarge   0      0      0      0
Figure 1: Pareto front obtained on solving the deterministic
VM allocation optimization problem using NSGA-II.
Consequently, the deterministic MOOP, which is an ILP, is solved
using binary-coded NSGA-II with population size = 500, number of
generations = 500, crossover probability = 0.9 and mutation
probability = 0.01. The obtained two-dimensional Pareto front is
presented (in the objective space) in Fig. 1. It is observed that
even though the maximum allowable execution time is 12 hours, the
application is completed within 10 hours (the maximum value of $T_E$
on the front), with the purchasing cost remaining low and well within
the budget limit. From the obtained PO solutions, the cloud broker
may choose any one solution based on higher-order information, for
example by selecting the resource that is situated closer to the
user's location (which might help in reducing communication cost).
For illustration, one of the PO solutions has been selected and its
corresponding decision variables are presented in Table 3 for each
type of VM provided by Amazon. The numbers of VMs are reported for
windows of three hours each. It is observed that a total of 43 VMs
were required for executing the considered scientific workflow.
Moreover, the number of VMs chosen at each time instance (represented
for a three-hour window in Table 3) does not follow any specific
pattern, and one of the VM configurations, c3.4xlarge, was not
allocated at any point in the time horizon. This shows that optimal
VM allocation is a non-trivial exercise.
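The paper states only that a binary-coded NSGA-II was used with crossover probability 0.9 and mutation probability 0.01; it does not name the variation operators. As a hedged sketch, the Python code below assumes single-point crossover and bit-flip mutation, and further assumes that a chromosome directly encodes the $x_{cjt}$ variables (5 VM types, $N_M = 30$ instances, 12 time periods). Both are assumptions made for illustration, and the function names are hypothetical.

```python
import random


def single_point_crossover(parent_a, parent_b, rng, p_crossover=0.9):
    """Swap the tails of two bit strings with probability p_crossover."""
    if rng.random() < p_crossover and len(parent_a) > 1:
        cut = rng.randrange(1, len(parent_a))
        return (parent_a[:cut] + parent_b[cut:],
                parent_b[:cut] + parent_a[cut:])
    return parent_a[:], parent_b[:]


def bit_flip_mutation(bits, rng, p_mutation=0.01):
    """Flip each bit independently with probability p_mutation."""
    return [1 - b if rng.random() < p_mutation else b for b in bits]


rng = random.Random(42)
# Assumed encoding: one bit per x_cjt, i.e. |C| * N_M * |T| bits.
n_bits = 5 * 30 * 12
parent_a = [rng.randint(0, 1) for _ in range(n_bits)]
parent_b = [rng.randint(0, 1) for _ in range(n_bits)]
child_a, child_b = single_point_crossover(parent_a, parent_b, rng)
child_a = bit_flip_mutation(child_a, rng)
child_b = bit_flip_mutation(child_b, rng)
```

Offspring produced this way will generally violate Eqs. 9 and 10, so a practical implementation would need a repair step or a constraint-handling rule; which of these the authors used is not stated in the paper.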
Figure 2: Pareto fronts obtained on solving the stochastic
VM allocation problem with varying probability of con-
straint satisfaction using CCP followed by NSGA-II.
4.2 Stochastic MOOP using CCP and
NSGA–II
The same application or problem instance is analyzed in this section,
but with the inclusion of uncertainties in the three demand
requirements from the user. The deterministic values that were used
previously (in Section 4.1) are allowed to deviate by 20% to obtain
the bounds on the uncertain parameters. In practice, these bounds
will usually be provided by the user or the cloud broker. It is
assumed that the three uncertain parameters follow a normal
distribution and the probability of constraint satisfaction ($p$) is
set to 0.75. Subsequently, CCP was applied to solve the stochastic
optimization problem of VM allocation (Eqs. 12 to 22), which is again
a multi-objective ILP. On converting this stochastic formulation into
the equivalent deterministic optimization problem using Eqs. 24 to 27
and solving it using NSGA-II, the two-dimensional Pareto-optimal
front shown in Fig. 2 is obtained. In comparison with the
deterministic solution, it is observed that the solution quality is
improved with respect to both objective function values. Considering
one of the PO solutions, the attained decision variables are
presented in Table 4 for each type of VM over the entire time horizon
(represented for three-hour windows). In this case, a total of 30 VMs
were required for executing the considered scientific workflow, fewer
than in the deterministic case. Further, c3.2xlarge VMs were not
allocated at any point in the time horizon while c3.4xlarge VMs were
allocated, in contrast to the deterministic solution, which implies
that consideration of uncertainty plays an important role in the
selection of optimal VMs.
Table 4: Optimal VM Configuration obtained for a Solution
chosen from the Stochastic Pareto Front (p=0.75).
VM Type      Time Periods
             1-3    4-6    7-9    10-12
c3.large     3      7      0      0
c3.xlarge    8      1      3      0
c3.2xlarge   0      0      0      0
c3.3xlarge   0      2      3      0
c3.4xlarge   0      0      3      0
Additionally, in order to study the effect of the probability of
constraint satisfaction ($p$), the value of $p$ is varied from 0.75
to 1 and the corresponding solutions are presented in Fig. 2. In
practice, a broker can decide on the $p$ value based on the SLA
agreed with the client. It is observed that as the $p$ value
increases, the solution quality varies and sometimes deteriorates,
but the reliability of the solution increases. Choosing too high a
value of $p$ might lead to conservative solutions, while a very small
$p$ value is not advisable either, since the solution becomes
unreliable.
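The effect of $p$ on a single constraint can be sketched with the storage requirement of the case study. The Python code below applies the right-hand side of Eq. 27 for several values of $p$. The paper gives the nominal value (S = 51 GB) and the 20% deviation but not how the standard deviation was derived, so two assumptions are made here for illustration only: the mean equals the nominal value, and the 20% bound corresponds to three standard deviations.

```python
from statistics import NormalDist

NOMINAL_STORAGE_GB = 51.0   # S from the case study
DEVIATION = 0.20            # 20% bound on the uncertain parameters
# Assumption (not from the paper): the bound equals three standard deviations.
SIGMA_GB = DEVIATION * NOMINAL_STORAGE_GB / 3.0


def required_storage(p, mean=NOMINAL_STORAGE_GB, std=SIGMA_GB):
    """Right-hand side of the storage constraint after applying Eq. 27."""
    return mean + NormalDist().inv_cdf(p) * std


levels = [0.75, 0.85, 0.95, 0.99]
requirements = [required_storage(p) for p in levels]
for p, req in zip(levels, requirements):
    print(f"p = {p:.2f}: provision at least {req:.2f} GB")
```

Under these assumptions the required capacity rises steadily with $p$, and for an unbounded normal distribution the quantile grows without limit as $p$ approaches 1, which is one way to see why very high values of $p$ lead to conservative allocations.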
5 CONCLUSIONS
This paper addresses the problem of VM allocation for scientific
workflows considering multiple objectives. The aim is to minimize the
purchasing cost and execution time while accounting for the
uncertainties in the users' demand requirements. A constrained
stochastic multi-objective optimization problem has been formulated
using Chance Constrained Programming (CCP), converted into its
deterministic equivalent and solved using NSGA-II. The results imply
that the deterministic optimization solutions are inferior to those
obtained for the stochastic formulation, which might be due to the
increased feasible region. Further, the effect of varying the level
of constraint satisfaction in CCP is studied.
Future studies in this direction of research could
be the following: (i) consideration of joint constraints
in both the deterministic as well as stochastic opti-
mization model, (ii) implementation of frequentist ap-
proach for calculation of probability values in CCP,
(iii) consideration of multi-cloud and other cost com-
ponents in the model.
REFERENCES
Aslam, S., ul Islam, S., Khan, A., Ahmed, M., Akhundzada,
A., and Khan, M. K. (2017). Information collection
centric techniques for cloud resource management:
Taxonomy, analysis and challenges. Journal of Net-
work and Computer Applications, 100:80–94.
Calheiros, R. N. and Buyya, R. (2013). Meeting deadlines
of scientific workflows in public clouds with tasks
replication. IEEE Transactions on Parallel and Dis-
tributed Systems, 25(7):1787–1796.
Calzarossa, M. C., Della Vedova, M. L., and Tessera, D.
(2019). A methodological framework for cloud re-
source provisioning and scheduling of data parallel
applications under uncertainty. Future Generation
Computer Systems, 93:212–223.
Coutinho, R. d. C., Drummond, L. M., Frota, Y., and
de Oliveira, D. (2015). Optimizing virtual machine al-
location for parallel scientific workflows in federated
clouds. Future Generation Computer Systems, 46:51–
68.
Deb, K. (2015). Multi-objective evolutionary algorithms.
In Springer Handbook of Computational Intelligence,
pages 995–1015. Springer.
Diwekar, U. M. (2020). Introduction to applied optimiza-
tion, volume 22. Springer Nature.
Fard, H. M., Prodan, R., and Fahringer, T. (2014). Multi-
objective list scheduling of workflow applications in
distributed computing infrastructures. Journal of Par-
allel and Distributed Computing, 74(3):2152–2165.
Ferdaus, M. H., Murshed, M., Calheiros, R. N., and Buyya,
R. (2017). An algorithm for network and data-aware
placement of multi-tier applications in cloud data cen-
ters. Journal of Network and Computer Applications,
98:65–83.
Heilig, L., Lalla-Ruiz, E., and Voß, S. (2016). A cloud bro-
kerage approach for solving the resource management
problem in multi-cloud environments. Computers &
Industrial Engineering, 95:16–26.
Heilig, L., Lalla-Ruiz, E., and Voß, S. (2020). Modeling
and solving cloud service purchasing in multi-cloud
environments. Expert Systems with Applications,
147:113165.
Hu, H., Li, Z., Hu, H., Chen, J., Ge, J., Li, C., and Chang,
V. (2018). Multi-objective scheduling for scientific
workflow in multicloud environment. Journal of Net-
work and Computer Applications, 114:108–122.
Li, Z., Ge, J., Yang, H., Huang, L., Hu, H., Hu, H., and
Luo, B. (2016). A security and cost aware scheduling
algorithm for heterogeneous tasks of scientific work-
flow in clouds. Future Generation Computer Systems,
65:140–152.
Madni, S. H. H., Abd Latiff, M. S., Coulibaly, Y., et al.
(2016). Resource scheduling for infrastructure as a
service (IaaS) in cloud computing: Challenges and
opportunities. Journal of Network and Computer
Applications, 68:173–200.
Mao, M. and Humphrey, M. (2011). Auto-scaling to min-
imize cost and meet application deadlines in cloud
workflows. In SC’11: Proceedings of 2011 Interna-
tional Conference for High Performance Computing,
Networking, Storage and Analysis, pages 1–12. IEEE.
Mitra, K. (2013). Chance constrained programming to han-
dle uncertainty in nonlinear process models. Multi-
Objective Optimization in Chemical Engineering: De-
velopments and Applications, pages 183–215.
Mohammadi, S., Pedram, H., and PourKarimi, L. (2018).
Integer linear programming-based cost optimization
for scheduling scientific workflows in multi-cloud
environments. The Journal of Supercomputing,
74(9):4717–4745.
Ning, C. and You, F. (2019). Optimization under uncer-
tainty in the era of big data and deep learning: When
machine learning meets mathematical programming.
Computers & Chemical Engineering, 125:434–448.
Odetayo, B., Kazemi, M., MacCormack, J., Rosehart,
W. D., Zareipour, H., and Seifi, A. R. (2018). A
chance constrained programming approach to the in-
tegrated planning of electric power generation, natural
gas network and storage. IEEE Transactions on Power
Systems, 33(6):6883–6893.
Pandey, S., Wu, L., Guru, S. M., and Buyya, R. (2010).
A particle swarm optimization-based heuristic for
scheduling workflow applications in cloud computing
environments. In 2010 24th IEEE International
Conference on Advanced Information Networking and
Applications, pages 400–407. IEEE.
Ramamurthy, A., Saurabh, S., Gharote, M., and Lodha, S.
(2020). Selection of cloud service providers for host-
ing web applications in a multi-cloud environment.
In 2020 IEEE International Conference on Services
Computing (SCC), pages 202–209. IEEE.
Rodriguez, M. A. and Buyya, R. (2014). Deadline based
resource provisioning and scheduling algorithm for
scientific workflows on clouds. IEEE Transactions on
Cloud Computing, 2(2):222–235.
Tchernykh, A., Schwiegelsohn, U., Talbi, E.-G., and
Babenko, M. (2019). Towards understanding uncer-
tainty in cloud computing with risks of confidentiality,
integrity, and availability. Journal of Computational
Science, 36:100581.
Wang, X. and Ning, Y. (2017). Uncertain
chance-constrained programming model for project
scheduling problem. Journal of the Operational
Research Society, pages 1–9.
Wen, Y., Chen, Z., Chen, T., Liu, J., and Kang, G. (2012). A
particle swarm optimization algorithm for batch pro-
cessing workflow scheduling. In 2012 Second Inter-
national Conference on Cloud and Green Computing,
pages 645–649. IEEE.
Zeng, L., Veeravalli, B., and Zomaya, A. Y. (2015). An
integrated task computation and data management
scheduling strategy for workflow applications in cloud
environments. Journal of Network and Computer Ap-
plications, 50:39–48.
Zhu, Z., Zhang, G., Li, M., and Liu, X. (2015). Evolu-
tionary multi-objective workflow scheduling in cloud.
IEEE Transactions on Parallel and Distributed
Systems, 27(5):1344–1357.