Impressionism in Cloud Computing
A Position Paper on Capacity Planning in Cloud Computing Environments
Iván Carrera Izurieta and Cláudio Resin Geyer
Institute of Informatics, Federal University of Rio Grande do Sul - UFRGS, Porto Alegre, Brazil
Keywords:
Cloud Computing, Capacity Planning, Virtualization, Performance Evaluation.
Abstract:
Cloud computing is a model that relies on virtualization and can lower costs to the user by charging only for
the computational resources used by the application. One way to bring the advantages of cloud computing
to data-intensive applications such as MapReduce is to use a virtual machine (VM) cluster in the cloud.
An interesting challenge with VM clusters is determining the size of the VMs that will compose the cluster,
because with an appropriate cluster and VM size, users will be able to take full advantage of resources, i.e.,
reducing costs by avoiding idle resources and gaining performance. This position paper is intended to bring to
consideration the necessity of accurate capacity planning at the user level, in order to take full advantage of
cloud resources, and focuses especially on users of data-intensive applications.
1 INTRODUCTION
Cloud computing, as defined by the U.S. National Institute of Standards and Technology (NIST), is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (Mell and Grance, 2011). The cloud infrastructure can be understood as an elastic shared pool of configurable computing resources, so users can define their virtual machine characteristics to meet application requirements, and increase or decrease those characteristics when the requirements change.
Cloud computing nowadays is a paradigm that is becoming more present in many environments and applications. Works like (Boutaba et al., 2012) agree that Cloud Computing is by far the most cost-effective technology for hosting Internet-scale services and applications, and also say that dealing with workload and cluster resource heterogeneity is a very important challenge.
A Brief Explanation about the Title
Impressionism was a 19th-century art movement originated by a group of Paris-based artists. Those artists developed a technique in which objects were represented by relatively small, thin, yet visible brush strokes. In Impressionism, objects were not drawn but only suggested by these brush strokes, whose size was determined by each artist's technique.
In a cloud computing environment, virtual machines of variable size compose a cluster that behaves like a single machine, much as the brush strokes did for the objects in Impressionism. In our case, the skill of the capacity planner will be to find the exact size of the virtual machines that can achieve a desired performance.
Virtualization is a key technology for Cloud Com-
puting because it introduces an abstraction layer that
partitions a computer system into virtual machines
as if they were physically isolated (Carissimi, 2008).
Managing computers as virtual machines that can be
modified and therefore optimized for a certain appli-
cation is a key advantage for Cloud Computing.
Distributed applications, when moved to cloud environments, require cost-effective and time-effective resource planning for their clusters, now formed by virtual machines. An optimal cluster size should
save money by optimizing resources (Boutaba et al.,
2012), but this optimization cannot be generalized for
all applications (Herodotou et al., 2011).
In this work we discuss a methodology that can help a cloud computing application user take advantage of the inherent elasticity of a cloud infrastructure to determine the size of a virtual machine cluster, running in the cloud, that can deliver predictable performance.
The remainder of this work is structured as follows: Section 2 discusses the motivation for this work and the relevance of capacity planning in cloud environments, Section 3 addresses a methodology for general-purpose capacity planning in cloud environments, Section 4 presents an example
of the methodology explained in Section 3, Section 5 discusses future work and challenges for this approach, and finally the conclusions of this work are presented in Section 6.
2 MOTIVATION
According to the Magellan Report, a document presented by the U.S. Department of Energy (DOE) in 2011 (Yelick et al., 2011), a cloud computing model can bring the user a significant gain in terms of scalability and usability of a computing infrastructure, but it still lacks good management of security and performance.
In the opinion of the authors of this work, a cloud performance management model should guide the cloud user through the capacity planning process of their virtual machine cluster, in order to achieve time- and cost-effective performance for their application.
In a private cloud environment, parameters of the virtual machines such as the number of processor cores, the amount of RAM, the bandwidth capacity and other features can be freely managed according to the administrator's definitions; on the other hand, in a public cloud environment this ability is restricted by the provider's instance types, which are specifications of the virtual hardware that will be provided for the virtual machines. For a given user, a modification of the parameters of the virtual machines will depend on the application; in other words, there is no single way to tune a virtual machine (Herodotou et al., 2011) in a cloud environment, but we believe that for a specific application this could be accomplished and generalized to similar applications.
Works like (Herodotou et al., 2011) and (Boutaba et al., 2012) state that an optimal cluster size should save money by optimizing cloud infrastructure resources for both the cloud user and the provider. Also, in (Mietzner and Leymann, 2008) the vision of a Software as a Service that requires Infrastructure as a Service level management is described as a very important research field and an interesting trend in cloud computing. This vision relates to resource provisioning based on application performance. This optimization allows users to take better advantage of the resources they are using in the cloud.
This way of thinking has been tested, for example, in (Wang et al., 2010), where the authors evaluate the ratio between the number of virtual machines and the number of cores of a physical computer, and how that ratio affects the performance of a distributed application (a MapReduce application). The overall suggestion of Wang's work is that on a computer with n cores, no more than n virtual machines should be instantiated. But, as we have said, there are more parameters that can be optimized for a given application in a cloud computing environment.
Another interesting work is (Ibrahim et al., 2009), where virtual machines are tuned for data-intensive applications, but the master node is shown to be a single point of failure (SPoF) in a MapReduce application. In that case, it is recommended that the master node be instantiated on a physical machine rather than a virtual machine. Ibrahim's work does not consider a cloud computing middleware, as it works directly with Xen-based virtual machines. In a cloud computing environment this way of thinking is valid, but it will not necessarily produce the same results, and the experiments will likely not be the same either.
By citing these works, we show the relevance of this particular issue of determining a priori the size of a virtual machine cluster for distributed applications to be run in the cloud, with the objective of achieving time- and/or cost-effective performance.
The approach of this work is to present a methodology that can guide users in finding a capacity planning model for cloud computing applications by modeling the performance of a specific application.
3 PROPOSED METHODOLOGY
This section explains how a capacity planning model can be built for a specific cloud computing application.
3.1 Define the Parameters that Affect
the Performance
For any application, it should be determined which characteristics of the cluster affect its performance. As has been said, virtual machines in a private cloud can be configured by customizing the number of cores, the amount of RAM, the storage capacity and other features, depending on the cloud computing middleware. All of these parameters could affect the performance in some way or another; the key will be to determine, for a specific application, whether or not these performance variations are significant.
In a public cloud environment these factors are not so easily modifiable, because of the instance types each cloud platform defines. So, the instance type is per se a single factor and should be treated as such.
Also, some other application-specific parameters
affect the performance of an application, and they
could be as diverse as the kind of applications any
ICEIS2013-15thInternationalConferenceonEnterpriseInformationSystems
334
user could want to run in a cloud computing envi-
ronment. For distributed applications, which is our
study field, the workload is certainly an important
factor that affects the performance of the cluster. In
MapReduce applications it is not only important to
know the amount of workload that a certain applica-
tion wants to process, but it is also important to know
what type of workload it is. Works like (Abad et al., 2012) study the kinds of workloads a MapReduce application can have, and other works like (Sangroya et al., 2012) even characterize the workloads into several groups, according to their types.
Another very important parameter is the number of virtual machines that will compose the cluster. Depending on their scalability and degree of parallelism, some applications could benefit from running on bigger or smaller clusters.
Defining these parameters will help users to focus
on what is actually affecting the performance.
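As an illustration of this step, the selected factors can be collected into a simple configuration structure that the later steps vary and record. The sketch below is a minimal example in Python; the parameter names and the baseline values are assumptions chosen for illustration, not part of the methodology itself.

from dataclasses import dataclass

@dataclass
class ClusterConfig:
    # Factors assumed to affect the performance of a hypothetical MapReduce job.
    num_vms: int            # number of virtual machines in the cluster
    cores_per_vm: int       # vCPUs per VM (freely configurable in a private cloud)
    ram_mb_per_vm: int      # amount of RAM per VM, in MB
    workload_type: str      # e.g. "plain-text", "log" (application-specific)
    workload_size_mb: int   # amount of input data to be processed

# An illustrative baseline configuration from which to start the observations.
baseline = ClusterConfig(num_vms=4, cores_per_vm=2, ram_mb_per_vm=1024,
                         workload_type="plain-text", workload_size_mb=512)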
3.2 Perform Observations
After the parameters that affect the performance are defined, observations of the behavior of the application have to be performed. Observations generate metrics, e.g., execution time, throughput, and time between errors (Jain, 1991), and are done with probes inserted into a normal implementation of the application. The selection of these metrics depends on the nature of the application and on what the user defines as performance. By varying each factor independently, a user can observe which parameters are responsible for how much of the variation in performance. This study is known as a general full factorial experimental design, which is a generalization of the 2^k r experimental design proposed in (Jain, 1991).
This approach has its disadvantages, because testing each and every combination of parameters could be very time-consuming. For example, if a public cloud user wants to test its application with:
3 types of workload,
4 sizes of each workload,
4 instance types for the virtual machines,
and 3 different cluster sizes,
this makes a total of 3 x 4 x 4 x 3 = 144 experiments, and with each experiment run 30 times in order to have statistically significant results, 144 x 30 = 4320 iterations in total. (In a private cloud, there would be even more variations, since instance types are not fixed.) Running 4320 iterations could be impractical, so some of the cases should be cut out of the experimental design, using the fractional factorial experimental design explained in (Jain, 1991), mainly because some of the combinations will be more important than others.
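To make the size of such a design concrete, the following sketch enumerates the full factorial design described above and a possible fractional reduction; the factor levels and the filtering rule are assumptions standing in for whatever levels and priorities the user actually defines.

from itertools import product

workload_types    = ["plain-text", "log", "clickstream"]   # 3 types of workload (illustrative)
workload_sizes_mb = [256, 512, 1024, 2048]                  # 4 sizes of each workload
instance_types    = ["small", "medium", "large", "xlarge"]  # 4 instance types
cluster_sizes     = [2, 4, 8]                               # 3 cluster sizes
replications      = 30

full_design = list(product(workload_types, workload_sizes_mb, instance_types, cluster_sizes))
print(len(full_design))                 # 3 * 4 * 4 * 3 = 144 experiments
print(len(full_design) * replications)  # 144 * 30 = 4320 iterations in total

# A fractional design keeps only the combinations judged most important,
# e.g. varying the cluster size only for one reference workload type.
fractional = [c for c in full_design if c[3] == 4 or c[0] == "plain-text"]
print(len(fractional))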
Observations done at this step will provide the in-
formation to continue with the following mathemati-
cal modeling.
3.3 Propose a Model
From the observations done in the previous step, some measurements must be taken. If the goal of the model is to improve the performance, the execution time should probably be the main metric to be measured, but other metrics could also be important for modeling the application performance, such as energy consumption, throughput, resource utilization, and time between errors.
After enough measurements are taken, and depending on the nature of the observations, a mathematical model can be proposed using regression models and statistical tools (Jain, 1991). There are some general guidelines that can help the user propose a model; for example, the number of instances is inversely related to the execution time (the more machines compose the cluster, the less time the application will need to run, until a limit is reached), while other parameters could be directly proportional to it, like the amount of workload (the more workload there is to be processed, the more time the application will need to run).
So, some prior criteria should help the user propose a model that relates the parameters specified in Section 3.1 to the metrics taken from the observations.
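As a sketch of how such a model could be proposed, assume the observations have been collected into arrays of factor values and measured execution times; an ordinary least-squares fit (here with numpy) can then relate, for example, the workload size divided by the number of virtual machines to the execution time. The model form and the sample values below are illustrative assumptions, not measured results.

import numpy as np

# Hypothetical observations: (number of VMs, workload size in MB) -> execution time in seconds.
num_vms   = np.array([2, 2, 4, 4, 8, 8], dtype=float)
workload  = np.array([512, 1024, 512, 1024, 512, 1024], dtype=float)
exec_time = np.array([310, 590, 180, 330, 110, 200], dtype=float)

# Proposed form: time = b0 + b1 * (workload / num_vms), so time grows with the workload
# and shrinks as machines are added, down to the limit captured by the intercept b0.
X = np.column_stack([np.ones_like(exec_time), workload / num_vms])
(b0, b1), *_ = np.linalg.lstsq(X, exec_time, rcond=None)
print(f"time ~ {b0:.1f} + {b1:.3f} * (workload / num_vms)")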
3.4 Assess the Model
Once the user has a model that specifies a mathematical relation between the parameters of the cluster and the application and the performance metrics, this model will have to be validated by experimentation.
The experiments done at this step should confirm (or not) the mathematical model within a certain confidence interval. So, values should be given to the parameters and, with these values, the metrics should be calculated a priori (before the experiments); then the experiments should be performed in order to confirm the calculations.
This is where things get interesting, because if the model is not quite correct, the results of this step should give feedback to correct and tune the mathematical model.
ImpressionisminCloudComputing-APositionPaperonCapacityPlanninginCloudComputingEnvironments
335
To help the user who is modeling the performance of the application, limits for the parameters should also be given, for example by stating that a model is only valid when the parameters are within limits specified for the application. This could make the model easier to define, but at the same time it would reduce the credibility of the model, since the strength of a model is its generalization.
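A minimal sketch of this assessment step, under the assumption that a model from the previous step is available and that a set of fresh runs has been performed with fixed parameter values: the a-priori prediction is compared against a 95% confidence interval for the mean of the observed execution times. The coefficients and measurements below are placeholders, not experimental data.

import numpy as np
from scipy import stats

def predicted_time(workload_mb, num_vms, b0=35.0, b1=1.05):
    # Placeholder model carried over from the modeling step (illustrative coefficients).
    return b0 + b1 * workload_mb / num_vms

observed = np.array([176.0, 182.5, 169.8, 190.2, 174.1])  # fresh runs, in seconds
mean, sem = observed.mean(), stats.sem(observed)
low, high = stats.t.interval(0.95, df=len(observed) - 1, loc=mean, scale=sem)

pred = predicted_time(workload_mb=512, num_vms=4)
print(f"predicted {pred:.1f}s, observed mean {mean:.1f}s, 95% CI [{low:.1f}, {high:.1f}]")
# A prediction outside the interval is the feedback used to correct and tune the model.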
In Figure 1 we present a simple flowchart that sums up the steps described in this section:
Figure 1: Flowchart for developing a model for capacity
planning.
4 A PRACTICAL EXAMPLE
We now present an example of how the methodology presented in this work can be put into practice. We performed some experiments to determine which components of the virtual machines in the cluster have an impact on the performance of a MapReduce application.
4.1 Description of the Experiments
For these experiments, the System Under Test (SUT) was a cluster of four virtual machines inside a cloud environment. Within this SUT, the component studied was the cluster as a whole; treating the cluster as a whole reflects the idea that it behaves as a single entity, and not as a collection of several components, which is better suited to improving the cluster performance.
The specified system ran a MapReduce application, particularly a Word Count application. In the experimentation, errors in execution were not considered, so all outcomes were treated as correct.
4.1.1 Parameters
A single factor was studied in these experiments: the amount of RAM of each virtual machine. The workload was fixed to a 512MB file of plain-text data for the Word Count application.
Three values of RAM were tested: 512MB, 1024MB and 1536MB. The experiments were performed 40 times.
4.1.2 Observations
The speed of the system was considered the main and only criterion for these experiments. The execution time was taken from the log files provided by the MapReduce application.
For these experiments, the following configura-
tion was used:
For the cloud infrastructure:
Cloud middleware: Eucalyptus 3.1.2
Virtual machine hypervisor: KVM
4 physical cloud nodes: 1 master, 3 slaves
Node: dual-core, Intel-VT, 4GB RAM, 1Gb Ethernet, 180GB HDD
For the MapReduce application:
Virtual machine cluster of 4 virtual nodes
Apache Hadoop 1.1.1 (released in Nov. 2012)
Example Word Count application, included in the Hadoop 1.1.1 package
Workload: 512MB of plain-text data
Metrics: execution time obtained from log files
4.1.3 Mathematical Model
Two regression models were applied to the data taken from the experiments.
For a linear model of the form:

time = β0 + β1 · RAM    (1)

the two constants were calculated, using the linear regression tool of the R software (The R Project for Statistical Computing, http://www.r-project.org/), as:

time = 279.59 − 0.05 · RAM    (2)

This linear model has a coefficient of determination R² = 0.8087. (The coefficient of determination is a value that measures the accuracy of a model; a model is better when R² gets closer to 1 (Jain, 1991).)
ICEIS2013-15thInternationalConferenceonEnterpriseInformationSystems
336
For a quadratic model of the form:

time = β0 + β1 · RAM + β2 · RAM²    (3)

the three constants were calculated as:

time = 259 − 1.044 × 10⁻² · RAM − 2.344 × 10⁻⁵ · RAM²    (4)

This quadratic model has R² = 0.8187.
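The fits above were obtained with the linear regression tool of R; as a rough sketch of how equivalent coefficients and R² values could be reproduced, assuming the per-run execution times were exported into arrays, numpy's polynomial fitting can be used as below. The numbers are placeholders, not the measured data.

import numpy as np

def r_squared(y, y_hat):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Placeholder data: RAM size (MB) of each run and the corresponding execution time (s).
ram  = np.repeat([512.0, 1024.0, 1536.0], 5)
time = np.array([265, 240, 280, 251, 259,
                 228, 233, 225, 230, 236,
                 204, 208, 201, 206, 210], dtype=float)

lin  = np.polyfit(ram, time, 1)   # [beta1, beta0] for time = beta0 + beta1 * RAM
quad = np.polyfit(ram, time, 2)   # [beta2, beta1, beta0]

print("linear:   ", lin,  "R^2 =", r_squared(time, np.polyval(lin, ram)))
print("quadratic:", quad, "R^2 =", r_squared(time, np.polyval(quad, ram)))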
4.1.4 Assessing the Model
Both values of R² are too similar to determine whether the relation between RAM and the execution time is actually linear or quadratic, so more tests will be needed. Visually, no clear tendencies can be seen when the results are plotted:
Figure 2: Execution time as a function of RAM of each VM.
Also, we can see that the results obtained with 512MB of RAM had a much larger dispersion than the results with 1024MB or 1536MB. This shows that 512MB of RAM was not an efficient memory size for the application. Further work should include more values of RAM for evaluating the performance, so that the relation between RAM and the execution time can be modeled more precisely.
So, the conclusions of these example experiments can guide new experiments and provide feedback for refining the model.
5 FUTURE WORK
AND CHALLENGES
The objective of this work is to present a methodology that can help users find a capacity planning model for cloud computing applications.
This work was motivated by research that is currently being performed with MapReduce applications, so the future work is to apply this experimental capacity planning model to MapReduce applications running in a cloud computing environment.
With MapReduce applications there are also more parameters to consider; one of them is the heterogeneity of the cluster. There are some works that address heterogeneity issues, like (Anjos et al., 2012) and (Zaharia et al., 2008), which establish that a virtual machine cluster should be homogeneous in order to have better performance, but they also say that in a cloud computing environment it is hard to have homogeneity in terms of disk and network I/O because of the co-location of virtual machines. This means that sometimes other virtual machines running on the same physical server can affect the performance of a virtual machine, making it harder to model the behavior of the cluster.
So, I/O heterogeneity is an important challenge that should be assessed when modeling capacity planning for MapReduce applications. In this case, testing high-performance storage platforms like the ones optimized for CDNs (Content Delivery Networks) is strongly recommended, because they will present different performance models, and also an interesting challenge.
When it comes to cloud computing environments, an important issue to assess is the cost generated by using a public cloud, so another interesting piece of future work is determining the price of processing a certain amount of workload and which specific parameters of a virtual machine cluster would provide the most cost-effective combination. This would have to take into account that the most time-effective combination will not always be the most cost-effective.
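As an illustration of this trade-off, the sketch below compares hypothetical cluster configurations by total cost; the hourly prices and predicted execution times are invented values, and in a real study they would come from the provider's price list and from the performance model of the previous sections.

# Hypothetical configurations: (name, number of VMs, price per VM-hour in USD, predicted time in hours).
configs = [
    ("4 small VMs", 4, 0.05, 3.0),
    ("8 small VMs", 8, 0.05, 1.8),
    ("4 large VMs", 4, 0.20, 1.0),
]

for name, n_vms, price_per_hour, hours in configs:
    cost = n_vms * price_per_hour * hours
    print(f"{name}: {hours:.1f} h, ${cost:.2f}")

# Here the fastest configuration (4 large VMs, $0.80) is not the cheapest one
# (4 small VMs, $0.60): time-effective and cost-effective optima can differ.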
Also, the methodology presented in this work should help in modeling other distributed cloud computing applications, so we encourage other researchers to try to confirm the methodology with other parallel application problems running in a cloud computing environment, such as graphics computation, distributed filesystems, data processing, content delivery, search engines, and others.
6 CONCLUSIONS
As we have said in this work, this simple but general methodology can come in handy when modeling the behavior of a parallel application in a cloud computing environment.
Determining a cluster size for a cloud computing
ImpressionisminCloudComputing-APositionPaperonCapacityPlanninginCloudComputingEnvironments
337
application will help users to improve the performance and reduce the time they pay for using the cloud infrastructure. Also, by optimizing the use of an infrastructure, we will be offering more resources to other users.
The reason why we present this methodology as a new contribution is that it helps cloud computing users take advantage of the exclusive features of the cloud, such as elasticity, the freedom to size the virtual machines, and the shorter time it would take a system administrator to review the cluster sizing and make changes to it.
This methodology could be used in other dis-
tributed environments, but it is in the cloud that a user
can really play with several configurations of virtual
machines.
We want to present this methodology for applications running in the cloud in as general a form as we can, so that any user can take advantage of it. The future work indicated in the previous section will confirm (or not) whether this methodology is general enough to help define a mathematical model for MapReduce applications, and hopefully for some other cloud applications as well.
ACKNOWLEDGEMENTS
This work was made with the support of the Programa Estudantes-Convênio de Pós-Graduação (PEC-PG), of CAPES/CNPq - Brazil.
This position paper was partially funded by Edital FAPERGS-CNPq n. 008/2009, Research Project GREEN-GRID: Sustainable High Performance Computing, and Edital MCT/CNPq No 70/2009 PGAEST MCT/CNPq.
Also, we would like to thank Professor Arnaud
Legrand from the CNRS, Centre National de la
Recherche Scientifique, and the University of Greno-
ble in France for his help and guidance in simplifying
the methodology presented in this work.
And a final acknowledgement to the special per-
son who inspired this vision of Impressionism in
Cloud Computing a few years ago by explaining the
art movement.
REFERENCES
Abad, C. L., Roberts, N., Lu, Y., and Campbell, R. H.
(2012). A storage-centric analysis of mapreduce
workloads: File popularity, temporal locality and ar-
rival patterns. In Workload Characterization (IISWC),
2012 IEEE International Symposium on, pages 100–
109. IEEE.
Anjos, J., Kolberg, W., Geyer, C. R., and Arantes, L. B.
(2012). Addressing data-intensive computing prob-
lems with the use of mapreduce on heterogeneous en-
vironments as desktop grid on slow links. In Com-
puter Systems (WSCAD-SSC), 2012 13th Symposium
on, pages 148–155. IEEE.
Boutaba, R., Cheng, L., and Zhang, Q. (2012). On cloud
computational models and the heterogeneity chal-
lenge. Journal of Internet Services and Applications,
3(1):77–86.
Carissimi, A. (2008). Virtualização: da teoria a soluções. Minicursos do Simpósio Brasileiro de Redes de Computadores (SBRC), 2008:173–207.
Herodotou, H., Dong, F., and Babu, S. (2011). No one
(cluster) size fits all: automatic cluster sizing for data-
intensive analytics. In Proceedings of the 2nd ACM
Symposium on Cloud Computing, page 18. ACM.
Ibrahim, S., Jin, H., Lu, L., Qi, L., Wu, S., and Shi, X.
(2009). Evaluating mapreduce on virtual machines:
The hadoop case. In Cloud Computing, pages 519–
528. Springer.
Jain, R. (1991). The art of computer systems performance
analysis, volume 182. John Wiley & Sons Chichester.
Mell, P. and Grance, T. (2011). The nist definition of cloud
computing (draft). NIST special publication, 800:145.
Mietzner, R. and Leymann, F. (2008). Towards provisioning
the cloud: On the usage of multi-granularity flows and
services to realize a unified provisioning infrastructure
for saas applications. In Services-Part I, 2008. IEEE
Congress on, pages 3–10. IEEE.
Sangroya, A., Serrano, D., and Bouchenak, S. (2012).
Benchmarking dependability of mapreduce systems.
In Reliable Distributed Systems (SRDS), 2012 IEEE
31st Symposium on, pages 21–30. IEEE.
Wang, P., Huang, W., and Varela, C. A. (2010). Impact of
virtual machine granularity on cloud computing work-
loads performance. In Grid Computing (GRID), 2010
11th IEEE/ACM International Conference on, pages
393–400. IEEE.
Yelick, K., Coghlan, S., Draney, B., Canon, R. S., et al.
(2011). The magellan report on cloud computing for
science. US Department of Energy Office of Sci-
ence, Office of Advanced Scientific Computing Re-
search (ASCR) December.
Zaharia, M., Konwinski, A., Joseph, A. D., Katz, R., and
Stoica, I. (2008). Improving mapreduce performance
in heterogeneous environments. In Proceedings of the
8th USENIX conference on Operating systems design
and implementation, pages 29–42.
ICEIS2013-15thInternationalConferenceonEnterpriseInformationSystems
338