Making Cloud-based Systems Elasticity Testing Reproducible

Michel Albonico

1,3

, Jean-Marie Mottu

, Gerson Sunye

and Frederico Alvares

AtlanModels Team (Inria, IMT-A, LS2N), France

Ascola Team (Inria, IMT-A, LS2N), France

Federal Technological University of Parana, Francisco Beltr

ao, Brazil

Keywords:

Cloud Computing, Elasticity, Elasticity Testing, Controllability, Reproducibility.

Abstract:

Elastic Cloud infrastructures are capable of dynamically varying computational resources at large scale, which

is error-prone by nature. Elasticity-related errors are detected thanks to tests that should run deterministically

many times all along the development. However, elasticity testing reproduction requires several features not

supported natively by the main cloud providers, such as Amazon EC2. We identify three requirements that we

claim to be indispensable to ensure elasticity testing reproducibility: to control the elasticity behavior, to select

speciﬁc resources to be unallocated, and coordinate events parallel to elasticity. In this paper, we propose an

approach fulﬁlling those requirements and making the elasticity testing reproducible. Experimental results

show that our approach successfully reproduces elasticity-related bugs that need the requirements we claim in

this paper.

1 INTRODUCTION

Elasticity is one of the main reasons that makes cloud

computing an emerging trend. It allows to vary (al-

locate or deallocate) system resources according to

demand (Herbst et al., 2013; Bersani et al., 2014).

Therefore, Cloud-Based Systems (CBS) must adapt

themselves to resource variations.

Due to their scalability nature, CBS adaptations

may be complex and error-prone (Herbst et al., 2013).

This is the case, for instance, of bug 2164 that affects

Apache ZooKeeper (Hunt et al., 2010), a coordination

service for large-scale systems. According to the bug

report, the election of a new leader never ends when

the leader leaves a system with three nodes. This

happens when the resource hosting the leader node

is deallocated.

Furthermore, some bug reproductions require

more than adaptation tasks. This is the case of Mon-

goDB NoSQL database bug 7974, where elasticity

and consequently, adaptation tasks, happen in paral-

lel with other events. In that case, even though adap-

tation tasks are still a requirement, they are not the

unique cause of the bug.

Reproducing bugs several times is necessary to di-

agnose and to correct them. That is, bugs reported in a

CBS bug tracking should be corrected by developers,

who need to reproduce them, at ﬁrst. Moreover, dur-

ing the development, regression tests should be run

regularly (Engstrom et al., 2010), requiring the de-

sign of efﬁcient and deterministic tests. Making CBS

testing reproducible is then an important concern.

To verify the existence of elasticity-related bugs

in the real-world, we analyze MongoDB’s bug track-

ing, a popular CBS. In the bug tracking we identify

a total of 43 bugs that involve elasticity. Since these

bugs occur in real-world, testers must manage elas-

ticity when writing and running their tests. However,

writing and executing tests considering the reproduc-

tion of elasticity-related bugs is complex. Besides the

typical difﬁculty of writing test scenarios and oracles,

it requires to deterministically manage the elasticity

and parallel events.

The management of elasticity involves driving

the system through a sequence of resource changes,

which is possible by varying the workload accord-

ingly (Albonico et al., 2016). In addition, tester

should also manage parameters that cannot be na-

tively controlled. For instance, resource changes are

managed by the cloud provider according to its own

policy. This prevents the tester to choose which re-

source must be deallocated, a requirement for 19 of

the 43 MongoDB’s elasticity-related bugs. The coor-

dination of events in parallel to elasticity (17/43 bugs)

is also an important parameter, which among others

needs a precise monitoring of resources.

Albonico, M., Mottu, J-M., Sunyé, G. and Alvares, F.

Making Cloud-based Systems Elasticity Testing Reproducible.

DOI: 10.5220/0006308905230530

In Proceedings of the 7th International Conference on Cloud Computing and Services Science (CLOSER 2017), pages 495-502

ISBN: 978-989-758-243-1

495

In this paper, we present an approach for elasticity

testing that allows the tester to reproduce tests con-

cerned with elasticity-related bugs. We control the

elastic behavior by sending satisfactory workload to

CBS that drives it through a sequence of resource al-

locations and deallocations. Our approach also pro-

vides two original contributions: the deallocation of

speciﬁc resources, and the coordination of events in

parallel with elasticity. The two original contribu-

tions are required to reproduce 30 of the 43 elasticity-

related bugs of MongoDB, i. e., ≈70 %. In the paper,

we also present a prototype for the approach.

To support our claims and to validate our ap-

proach, we select 1 representative bug, and conducted

a set of experiments on Amazon EC2. The selected

bug requires our approach to satisfy all the require-

ments we identiﬁed. Experiment results show that

our approach reproduces real-world elasticity-related

bugs fulﬁlling all the requirements.

The remainder of this paper is organized as fol-

lows. In the next section, we remind the major as-

pects of cloud computing elasticity. Section 3 details

the requirements we claim be necessary for elasticity

testing. Section 4 introduces the testing approach we

propose. The experiment and its results are described

in Section 5. Section 6 discusses threats to validity.

Finally, Section 7 concludes.

2 CLOUD COMPUTING

ELASTICITY

This section provides the deﬁnitions of the main con-

cepts related to Cloud Computing Elasticity that are

required for the good understanding of our approach.

2.1 Typical Elastic Behavior

Figure 1 presents the typical behavior of elastic cloud

computing applications (Albonico et al., 2016). In

this ﬁgure, the resource demand (continuous line)

varies over time, increasing from 0 to 1.5 then de-

creasing to 0. A resource demand of 1.5 means that

the application demands 50 % more resources than the

current allocated ones.

When the resource demand exceeds the scale-out

threshold and remains higher during the scale-out re-

action time, the cloud elasticity controller assigns a

new resource. The new resource becomes available

after a scale-out time, the time the cloud infrastruc-

ture spends to allocate it. Once the resource is avail-

able, the threshold values are updated accordingly. It

is similar considering the scale-in, respectively. Ex-

cept that, as soon as the scale-in begins, the threshold

1.5

Resource Allocation

Resource Demand

Scale-out Threshold

Scale-in Threshold

Scale-out Threshold Breaching

Scale-in Threshold Breaching

Time (s)

Resource

(Processors)

0.4

0.8

1.6

scale-out reaction time

scale-out time

scale-in reaction time

scale-in time

Legend

Figure 1: Typical Elastic Behavior.

values are updated and the resource is no longer avail-

able. Nonetheless, the infrastructure needs a scale-in

time to release the resource.

2.2 Elasticity States

When an application is deployed on a cloud infras-

tructure, workload ﬂuctuations lead to resource vari-

ations (elasticity). These variations drive the applica-

tion to new, elasticity-related, states. Figure 2 depicts

the runtime lifecycle of an application running on a

cloud infrastructure.

ready (ry)

scaling

-out (so)

steady

(ry_s)

si reaction

(ry_sir)

scaling

-in (si)

so reaction

(ry_sor)

Figure 2: Elasticity States.

At the beginning the application is at the ready

state (ry), when the resource conﬁguration is steady

(ry s substate). Then, if the application is exposed

for a certain time (scale-out reaction time, ry sor sub-

state) to a pressure that breaches the scale-out thresh-

old, the cloud elasticity controller starts adding a new

resource. At this point, the application moves to the

scaling-out state (so) and remains in this state while

the resource is added. After a scaling-out, the appli-

cation returns to the ready state. It is similar with the

scaling-in state (si), respectively.

2.3 Elasticity Control

We can categorize elasticity control approaches in

two groups: (i) direct resource management, and (ii)

generation of adequate workload.

The ﬁrst and simplest one (i) interacts directly

with the cloud infrastructure, asking for resource al-

location and deallocation. The second one (ii) con-

sists in generating adequate workload variations that

drive CBS throughout elasticity states, as previously

CLOSER 2017 - 7th International Conference on Cloud Computing and Services Science

496

explained in Section 1. It is more complex since re-

quires a preliminary step for proﬁling the CBS re-

source usage, and calculating the workload variations

that trigger the elasticity states.

3 REQUIREMENTS FOR

ELASTICITY TESTING

REPRODUCTION

In this paper, we consider three requirements for re-

producing elasticity testing: elasticity control, selec-

tive elasticity, and events scheduling.

Elasticity Control is the ability to reproduce a spe-

ciﬁc elastic behavior. Elasticity-related bugs may oc-

cur after a speciﬁc sequence of resource allocations

and deallocations. Logically, all the 43 elasticity-

related bugs of MongoDB should satisfy this require-

ment.

Analyzing elasticity-related bugs, we identify two

other requirements: selective elasticity, and event

scheduling. Those requirements are necessary to re-

produce 30 of the 43 selected bugs.

Selective Elasticity is the necessity to specify pre-

cisely which resource must be deallocated. The repro-

duction of some elasticity-related bugs requires a de-

terministic management of elasticity changes. For in-

stance, deallocating a resource associated to the mas-

ter component of a cloud-based system. From the se-

lected bugs, 19 require selective elasticity.

Event Scheduling is the necessity to synchronize

elasticity changes with parallel events. We consider

as an event any interaction or stimulus to CBS, such

as forcing a data increment or to simulate infrastruc-

ture failures. The reproduction of 17 of MongoDB

elasticity-related bugs requires event scheduling.

Table 1 shows the number of bugs that should

meet those requirements in order to be reproducible.

As already said, all the bugs require elasticity con-

trol.A total of 30 bugs (70 %) should meet require-

ments besides elasticity, where 6 bugs need all the

requirements. Finally, 13 bugs only need elasticity

control (30 %) .

Table 1: Requirements for Bug Reproductions.

Elasticity

Control

Selective

Elasticity

Event

Scheduling All

Only

Elasticity

Control

Quantity 43 19 17 6 13

4 ELASTICITY TESTING

APPROACH

In this section, we present the overall architecture of

our approach as well as the aspects on the prototype

implementation.

4.1 Architecture Overview

Figure 3 depicts the overall architecture of our ap-

proach. The architecture has four main components:

Elasticity Controller Mock (ECM), Workload Gener-

ator (WG), Event Scheduler (ES), and Cloud Monitor

(CM).

Elasticity

Controller

Mock

Event

Scheduler

E = {ec1=(s

),ec

, …, ec

}

SER = {(ec

, ser

), …}

Events Schedule

Workload

Generator

Cloud

System

Workload

Current Elasticity State

Event Execution

Resource Variation

Cloud

Monitor

Figure 3: Overall Architecture.

The ECM simulates the behavior of the cloud

provider elasticity controller, allocating and deal-

locating determined resources, according to testing

needs. It also asks the WG to generate the workload

accordingly, reproducing a realistic scenario. The

role of the ES is to schedule and execute a sequence

of events in parallel with the other components. Fi-

nally, the CM monitors the cloud system, gathering

information that helps orchestrating the behavior of

the three other components, ensuring the sequence

of elasticity states, and their synchronization with the

events.

Table 2 summarizes the requirements that each

component helps in ensuring, as we detail in this sec-

tion.

Table 2: Requirements Ensured by the Architecture’s Com-

ponents.

COMPONENT

Elasticity

Control

Selective

Elasticity

Event

Scheduling

ECM YES YES YES

WG YES NO NO

ES NO NO YES

CM YES YES NO

Making Cloud-based Systems Elasticity Testing Reproducible

497

4.1.1 Elasticity Controller Mock

The ECM is designed to reproduce the elastic behav-

ior. By default, ECM requires as input a sequence of

elasticity changes, denoted by E = {ec

, ec

, ..., ec

where each ec is a pair that corresponds to an elas-

ticity change. Elasticity change pairs are composed

of a required elasticity state (s

) and a workload (W

, W

where 1 ≤ i ≤ n. A workload is charac-

terized by an intensity (i. e., amount of operations per

second), and a workload type (i. e., set of transactions

sent to the cloud system).

ECM reads elasticity change pairs sequentially.

For each pair, ECM requests resource changes to meet

elasticity state s

and requests the Workload Genera-

tor to apply the workload W

. Indeed, we have to send

the corresponding workload to prevent cloud infras-

tructure to provoke unexpected resource variations.

In particular, it could deallocate a resource that ECM

just allocated, because the workload has remained low

and under the scale-in threshold.

Instead of waiting for the cloud computing infras-

tructures for elasticity changes, it directly requests

to change the resource allocation (elasticity control).

Based on both, required elasticity state and workload

(elasticity change pair), ECM anticipates the resource

changes. To be sure CBS enters the expected elas-

ticity state, ECM queries the Cloud Monitor, which

periodically monitors the cloud infrastructure.

The ECM may also lead to a precise resource

deallocation (selective elasticity). Typically, elastic-

ity changes are transparent to the tester, managed by

the cloud provider. To set up the selective elastic-

ity, ECM requires a secondary input, i. e., Selective

Elasticity Requests (SER). SER is denoted by SER =

{(ec

, ser

), ..., (ec

, ser

)}, where ec

∈ E, and ser

refers to a selective elasticity request. A selective

elasticity request is a reference to an algorithm (freely

written by tester) that gets a resource’s ID. When ec

is performed by ECM, the algorithm referred by ser

is executed, and the resource with the returned ID is

deallocated by ECM.

ECM helps in ensuring all of elasticity testing re-

quirements. As earlier explained in this section, it de-

terministically requests resource variations (elasticity

control and selective elasticity), and helps on ensur-

ing the event scheduling providing information of the

current elasticity state to the Event Scheduler.

4.1.2 Workload Generator

The Workload Generator is responsible for generat-

ing the workload (W ). We base it on Albonico et

al. work (Albonico et al., 2016), which takes into

account a threshold-based elasticity (see Figure 1),

where resource change demand occurs when a thresh-

old is breached for a while (reaction time). Therefore,

a workload should result in either threshold breached

(for scaling states) or not breached (for ready state),

during the necessary time. To ensure this, the Work-

load Generator keeps the workload constant, either

breaching a threshold or not, until a new request ar-

rives.

Considering a scale-out threshold is set as 60% of

CPU usage, the workload should result in a CPU us-

age higher than 60% to request a scale-out. In that

case, if 1 operation A hypothetically uses 1% of CPU,

it would be necessary at least 61 operations A to re-

quest the scale-out. On the other hand, less than 61

operations would not breach the scale-out threshold,

keeping the resource steady.

The Workload Generator contributes with the

Elasticity Controller Mock to the elasticity control re-

quirement.

4.1.3 Event Scheduler

The Event Scheduler input is a map associating sets of

events to elasticity changes (ec

), i. e., the set of events

that should be sent to the cloud system when a given

elasticity change is managed by the ECM. Table 3 ab-

stracts an input where four events are associated to

two elasticity changes.

Table 3: Events Schedule.

Elasticity Change Event ID Execution Order Wait Time

e1 1 0 s

e2 2 10 s

e3 2 0 s

e2 1 0 s

e4 2 0 s

Periodically, the Event Scheduler polls the ECM

for the current elasticity change, executing the events

associated to it. For instance, when the ECM manages

the elasticity change ec

, it executes the events e

, e

and e

. Events have execution orders, which deﬁne

priorities among events associated to the same state:

event e

is executed before events e

and e

. Events

with the same execution order are executed in paral-

lel (e. g., e

and e

). Events are also associated to a

wait time, used to delay the beginning of an event. In

Table 3, event e

has a wait time of 10 s (starting 10 s

after e

, but nonetheless executed in parallel). This

delay may be useful, for instance, to add a server to

the server list a few seconds after the ready state be-

gins, waiting for data synchronization to be ﬁnished.

The Event Scheduler ensures the event scheduling

requirement.

CLOSER 2017 - 7th International Conference on Cloud Computing and Services Science

498

4.1.4 Cloud Monitor

Cloud Monitor helps ECM to ensure elasticity control

and selective elasticity. It periodically requests cur-

rent elasticity state and stores it in order to respond

to the ECM queries, necessary for elasticity control.

It also executes the selective elasticity algorithm of

SER, responding to ECM with the ID of the found

resources.

4.2 Prototype Implementation

Each component of the testing approach architecture

is implemented in Java and communicate with each

other through Java RMI. Currently, we only support

Amazon EC2 interactions, though one could adapt

our prototype to interact with other cloud providers.

4.2.1 Elasticity Controller Mock

The elasticity changes are described in a property ﬁle.

The entries are set as

key, value

pairs, as presented

in Listing 1. The key corresponds to the elasticity

change name, while the value corresponds to the elas-

ticity change pair. The ﬁrst part of the value is the

elasticity state, and the second part is the workload,

divided into intensity and type.

Listing 1: Example of Elasticity Controller Mock Input File

(Elasticity Changes).

ec1=ready, (1000,write)

ec2=scaling−out, (2000,read/write)

...

ec4=scaling−in, (1500,read)

As previously explained, for each entry, the ECM

sends the workload parameters to the Workload Gen-

erator and deterministically requests the speciﬁed

resource change. Resource changes are requested

through the cloud provider API, which enables re-

source allocation and deallocation, general infrastruc-

ture settings, and monitoring tasks. Before perform-

ing an elasticity change, the ECM asks the Cloud

Monitor whether the previous elasticity state was

reached. Cloud Monitor uses the Selenium

auto-

mated browser to gather pertinent information from

cloud provider’s dashboard Web page.

We use Java annotations to set up selective elastic-

ity requests (SER), as illustrated in Listing 2. A Java

method implements the code that identiﬁes a speciﬁc

resource and returns its identiﬁer as a String type. It

is annotated with metadata that speciﬁes its name and

associated elasticity change.

http://www.seleniumhq.org/

Listing 2: Selective Elasticity Input File.

@Selection{name=”ser1”, elasticity change=”ec4”}

public String select1() {

... //code to ﬁnd a resource ID

return resourceID; }

4.2.2 Workload Generator

The WG generates the workload according to the pa-

rameters received from the ECM (i. e., workload type

and intensity), whereas the workload is cyclically gen-

erated until new parameters arrive. It uses existing

benchmark tools, setting the workload parameters in

the command line.

For instance, YCSB benchmark tool allows three

parameters related to the workload: the preset work-

load proﬁle, the number of operations, and the num-

ber of threads. The preset workload proﬁle refers to

the workload type, while the multiplication of the two

last parameters results in the workload intensity.

4.2.3 Event Scheduler

Event schedule is set in a Java ﬁle, where each event

is an annotated method, such as the example illus-

trated in Listing 3. Java methods are annotated with

the event identiﬁer, the related elasticity change, the

order, and the waiting time. EC periodically polls the

ECM to obtain the current elasticity change. Then, it

uses Java Reﬂection to execute the Java methods re-

lated to it.

Listing 3: Example of Event Scheduler Input File.

@Event{id=”e1”,elasticity change=”ec1”,

order=”1”, wait=”0”}

public void event1() { ... }

4.3 Prototype Execution

Before executing our approach’s prototype, testers

must deploy the CBS, which is done using an exist-

ing approach based on a Domain Speciﬁc Language

(DSL) by Thiery et al. (Thiery et al., 2014). The DSL

enables us to abstract the deployment complexities in-

herent to CBS. Information in deployment ﬁle, such

as cloud provider’s credentials and Virtual Machine’s

conﬁguration, are then used by the components of our

prototype. However, since the deployment of CBS is

not a contribution of our approach, we do not further

explain it in this paper.

To execute our approach’s prototype, testers write

the input ﬁles: E, SER, and ES ﬁles. These ﬁles as

well as the deployment ﬁle are passed to the prototype

as command line parameters. Then, all the execution

is automatically orchestrated.

Making Cloud-based Systems Elasticity Testing Reproducible

499

5 EXPERIMENTS

This section presents the experiment where we at-

tempt to reproduce the MongoDB bug 7974. We be-

lieve that the bug is representative, since it needs all

the requirements we claim in this paper in order to

be identiﬁed. The experiments consists in attempt-

ing to reproduce the bugs in two ways: using our ap-

proach, and relying on the cloud computing infras-

tructure. Then, we compare both approaches to verify

whether the requirements that we identify in this pa-

per are met.

5.1 Experimental Environment

5.1.1 Case Study

MongoDB

is a NoSQL document database. Mon-

goDB has three different components: the conﬁgu-

ration server, MongoS and MongoD. The conﬁgura-

tion server stores metadata and conﬁguration settings.

While MongoS instances are query routers, which en-

sure load balance, MongoD instances store and pro-

cess data.

To generate the workload for the experiment with

MongoDB, we use the Yahoo Cloud Serving Bench-

mark (YCSB) (Cooper et al., 2010).

5.1.2 Cloud Computing Infrastructure

All the experiments are conducted on the commer-

cial cloud provider Amazon Elastic Cloud Compute

(EC2), where we set scale-out and scale-in thresh-

olds as 60 % and 20 % of CPU usage, respectively.

MongoS instance is deployed on a large machine

(m3.large), while the other instances are deployed on

medium machines (m3.medium).

5.1.3 Requirements for Bug Reproduction

This bug affects the MongoDB versions 2.2.0 and

2.2.2, when a secondary component of a MongoDB

replica set

is deallocated. Indeed, in a MongoDB

replica set, one of the components is elected as pri-

mary member, which works as a coordinator, while

the others remain as secondary members.

To reproduce this bug, we must follow a speciﬁc

elastic behavior: initialization of a replica set with

three members, deallocation of a secondary member,

and allocation of a new secondary member. There-

fore, the second step of the elastic behavior requires

the deallocation of a precise resource, one of the

https://www.mongodb.org/

https://docs.mongodb.com/replica-set

secondary members. The bug reproduction also re-

quires two events synchronized to elasticity changes.

Right after the secondary member deallocation, we

must create a unique index, and after the last step of

the elastic behavior, we must add a document in the

replica set.

In conclusion, the reproduction of this bug needs

to meet all the requirements that we consider in this

paper: elasticity control, selective elasticity, and event

scheduling.

5.2 Bug Reproduction

In this section, we describe the use of our approach to

reproduce the selected bug, and compare the results to

reproduction attempts without our approach. We do

not explain in details the setup of reproductions with-

out our approach but we assume one can manage the

control elasticity and meet this requirement. Indeed,

elasticity is a native feature of cloud computing in-

frastructures, and we just drive CBS through required

elastic behavior using Albonico et al. approach (Al-

bonico et al., 2016).

For the reproduction of MongoDB bug 7974 using

our approach, we ﬁrst manually create the MongoDB

replica set, composed by three nodes. Then, we set up

the following sequence of elasticity changes, which

should drive MongoDB through the required elastic

behavior:

E =

4500, r

1500, r

3000, r

4500, r

Since we must deallocate a secondary member of

MongoDB replica set at elasticity change ec

, it is as-

sociated to a selective elasticity request (SER). The

SER queries MongoDB replica set’s members, using

MongoDB shell method db.isMaster, until ﬁnding a

member that is secondary.

In parallel to the elasticity changes, we set up two

events, e1 and e2, which respectively create a unique

index, and insert a new document in the replica set.

The e1 is associated to elasticity change ec

, a ready

state that follows the scaling-in state where a sec-

ondary member is deallocated. The e2 is associated to

elasticity change ec

, the last ready state. Both events

are scheduled without waiting time.

Table 4: MongoDB-7974 Event Schedule.

Elasticity Change Event ID Execution Sequence Wait Time

e1 1 0 s

e2 1 0 s

We repeat the bug reproduction for three times.

After each execution, we look for the expression ”du-

CLOSER 2017 - 7th International Conference on Cloud Computing and Services Science

500

plicate key error index” in the log ﬁles. If the expres-

sion is found, we consider the bug is reproduced.

Table 5 shows the result of all the three executions,

either using our approach or not. All the attempts us-

ing our approach reproduce the bug, while none of the

attempts without our approach do it.

Table 5: MongoDB-7974 Bug Reproduction Results.

Reproduction Reproduced Not Reproduced

With Our Approach 3 0

Without Our Approach 0 3

For the executions without our approach, we force

MongoDB to elect the intermediate node (in the or-

der of allocation) as primary member

, what can oc-

casionally occur in a real situation. In this scenario,

regardless of the scale-in settings, cloud computing

infrastructure always deallocate a secondary member,

since Amazon EC2 only allows to deallocated the old-

est or newest nodes. Despite in this experiment we

force a selective elasticity, in a real situation without

using our approach the resource selection would not

be deterministic. For instance, the newest or oldest

node could be elected as a primary member. Even

though cloud computing infrastructures reproduces

the required elastic behavior, this bug still needs the

event executions, which must be correctly synchro-

nized. This is the why the bug cannot be reproduced

without our approach.

6 RELATED WORK

Several research efforts are related to our approach

in terms of elasticity control, selective elasticity, and

events scheduling. The work of Gambi et al. (Gambi

et al., 2013b; Gambi et al., 2013a) addresses elastic-

ity testing. The authors predict elasticity state transi-

tion based on workload variations and test whether

cloud infrastructures react accordingly. However,

they do not focus on controlling elasticity and cannot

drive cloud application throughout different elasticity

states.

Banzai et al. (Banzai et al., 2010) propose D-

Cloud, a virtual machine environment specialized in

fault injection. Like our approach, D-Cloud is able

to control the test environment and allows testers to

specify test scenarios. Test scenarios are speciﬁed

in terms of fault injection and not on elasticity and

events (as in our approach).

Yin et al. (Yin et al., 2013) propose CTPV, a Cloud

Testing Platform Based on Virtualization. The core

of CTPV is the private virtualization resource pool.

https://docs.mongodb.com/force-primary

The resource pool mimics cloud infrastructures en-

vironments, which in part is similar to our elasticity

controller. CTPV differs from our approach in two

points: (i) it does not use real cloud infrastructures

and (ii) it uses an elasticity controller that does not

anticipate resource demand reaction.

Vasar et al. (Vasar et al., 2012) propose a frame-

work to monitor and test cloud computing web appli-

cations. Their framework replaces the cloud elasticity

controller, predicting the resource demand based on

past workload. Contrary to our approach, they do not

allow to control a speciﬁc sequence of elasticity states

or events.

Li et al. (Li et al., 2014) propose Reprolite, a tool

that reproduces cloud system bugs quickly. Similarly

to our approach, Reprolite allows for the execution of

parallel events on the cloud system and on the envi-

ronment, but does not focus on elasticity, one of our

main contributions.

7 CONCLUSION

In this paper, we proposed an approach to reproduce

elasticity testing in a deterministic manner. This ap-

proach is based on three main features: elasticity con-

trol, selective elasticity, and event scheduling. We use

our approach to successfully reproduce a bug of Mon-

goDB, a popular open source system. Indeed, the bug

cannot be deterministically reproduced with state-of-

the-art approaches.

As testing is not only about reproducing existing

bugs, but also diagnosing them, a likely evolution for

our approach is to generate different test scenarios

combining elasticity state transitions, workload vari-

ations, selective elasticity, and event scheduling. An-

other feature we plan to investigate as future work is

the speediness of test executions. Deterministic re-

source allocation can accelerate state transitions and

thus optimize the number of executions per period of

time, and/or reduce execution costs. This is particu-

larly important when testing cloud systems: to per-

form our three experiments, we used ≈45 machine-

hours in Amazon EC2 . Finally, we also plan to apply

model-driven engineering to create an uniﬁed high-

level language. In the current implementation, the

writing of test cases involves a mix of shell scripts,

Java classes, and conﬁguration ﬁles, which are not

very suitable for users.

Making Cloud-based Systems Elasticity Testing Reproducible

501

REFERENCES

Albonico, M., Mottu, J.-M., and Suny

e, G. (2016). Con-

trolling the Elasticity of Web Applications on Cloud

Computing. In The 31st SAC 2016, Pisa, Italy.

ACM/SIGAPP.

Banzai, T., Koizumi, H., Kanbayashi, R., Imada, T.,

Hanawa, T., and Sato, M. (2010). D-Cloud: De-

sign of a Software Testing Environment for Reliable

Distributed Systems Using Cloud Computing Tech-

nology. In Proceedings of CCGRID’10, Washington,

USA.

Bersani, M. M., Bianculli, D., Dustdar, S., Gambi, A.,

Ghezzi, C., and Krsti

c, S. (2014). Towards the For-

malization of Properties of Cloud-based Elastic Sys-

tems. In Proceedings of PESOS 2014, New York, NY,

USA. ACM.

Cooper, B. F., Silberstein, A., Tam, E., Ramakrishnan, R.,

and Sears, R. (2010). Benchmarking Cloud Serving

Systems with YCSB. In Proceedings of SoCC’10,

New York, NY, USA. ACM.

Engstrom, E., Runeson, P., and Skoglund, M. (2010). A sys-

tematic review on regression test selection techniques.

Information and Software Technology, 52(1):14–30.

Gambi, A., Hummer, W., and Dustdar, S. (2013a). Auto-

mated testing of cloud-based elastic systems with AU-

ToCLES. In The proceedings of ASE’13, pages 714–

717. IEEE/ACM.

Gambi, A., Hummer, W., Truong, H.-L., and Dustdar, S.

(2013b). Testing Elastic Computing Systems. IEEE

Internet Computing, 17(6):76–82.

Herbst, N. R., Kounev, S., and Reussner, R. (2013). Elas-

ticity in Cloud Computing: What It Is, and What It Is

Not. ICAC.

Hunt, P., Konar, M., Junqueira, F. P., and Reed, B. (2010).

Zookeeper: Wait-free coordination for internet-scale

systems. In 2010 USENIX, Boston, MA, USA, 2010.

Li, K., Joshi, P., Gupta, A., and Ganai, M. K. (2014). Re-

proLite: A Lightweight Tool to Quickly Reproduce

Hard System Bugs. In Proceedings of SOCC’14, New

York, NY, USA.

Thiery, A., Cerqueus, T., Thorpe, C., Sunye, G., and Mur-

phy, J. (2014). A DSL for Deployment and Testing in

the Cloud. In Proc. of the IEEE ICSTW 2014, pages

376–382.

Vasar, M., Srirama, S. N., and Dumas, M. (2012). Frame-

work for Monitoring and Testing Web Application

Scalability on the Cloud. In Proc. of WICSA/ECSA

Companion, NY, USA.

Yin, L., Zeng, J., Liu, F., and Li, B. (2013). CTPV: A Cloud

Testing Platform Based on Virtualization. In The pro-

ceedings of SOSE’13.

CLOSER 2017 - 7th International Conference on Cloud Computing and Services Science

502