
applications, the variation can even approach
60%.
b) Different configurations: even on identical
hardware, the way a resource is configured plays a
significant role in its performance. The same applies
to software configurations (e.g. a DB instance over a
virtual cluster) and to variations in software
development.
c) Multi-tenancy and opaque, black-box
management by providers: Cloud infrastructures
serve multiple users who may start their
virtual resources on the same physical host at any
given time. Concurrently running VMs, for example,
significantly degrade actual application
performance (Kousiouris et al., 2011). The degradation
is further affected by the usage patterns of these
resources by their virtual owners or their clients.
Furthermore, consolidation decisions made by
providers, which are unknown to the users, may group
virtual resources on the same physical node at any
given time without informing the owner.
d) VM interference effects: the study in (Koh et
al., 2007) investigates the performance interference
for a number of applications, selected for
classifying their behaviour using different metrics,
in experimental virtual environments. The results
show that combined performance varies substantially
with different combinations of applications.
Applications that rarely interfere with each other
achieve performance close to their standalone
performance. However, some combinations interfere
with each other in an adverse way. Furthermore,
virtualization is a technology used in all Cloud data
centers to ensure high utilization of hardware
resources and better manageability of VMs. Despite
these advantages, virtualization technologies do not
provide effective performance isolation.
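The interference effect described in d) can be expressed as the ratio between co-located and standalone performance. The following is a minimal sketch, assuming a throughput-style score where higher is better. The function name, the application labels and all numbers are hypothetical and are not taken from (Koh et al., 2007).

```python
def normalized_performance(standalone, colocated):
    """Return the co-located score as a fraction of the standalone score.

    Assumes higher scores are better; 1.0 means no interference.
    """
    if standalone <= 0:
        raise ValueError("standalone score must be positive")
    return colocated / standalone


# Hypothetical throughput scores (illustrative, not measured data).
standalone_scores = {"app_a": 100.0, "app_b": 80.0}
colocated_scores = {
    ("app_a", "app_b"): {"app_a": 95.0, "app_b": 40.0},
}

for pair, scores in colocated_scores.items():
    for app, score in scores.items():
        ratio = normalized_performance(standalone_scores[app], score)
        print(f"{app} in {pair}: {ratio:.2f} of standalone")
```

In this made-up pairing, one application stays close to its standalone score while the other loses half of it, which is the kind of asymmetry the text refers to.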
All these aspects, plus the fact that Cloud
providers are separate entities and no information is
available on their internal structure and operation,
make it necessary to macroscopically examine a
provider’s behaviour with regard to the offered
resources and on a series of metrics. This process
should be performed through benchmarking, using
suitable tools and tests. A key aspect is that, due
to this dynamicity in resource management, the
benchmarking process must be iterated over time.
Iteration ensures, as far as possible, that different
hardware and different management decisions (e.g.
update, reconfiguration or improvement of the
infrastructure) are reflected in the refreshed
metric values. It also makes it possible to observe
key characteristics such as performance variation,
standard deviation etc. Finally, the acquired
information should be represented in a
machine-understandable way, in order to be used in
decision-making systems.
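As a minimal sketch of the iterated measurement idea, the following Python snippet summarizes repeated runs of one benchmark on one offering and serializes the summary as JSON. The function name `summarize_runs`, the field names and the sample scores are illustrative assumptions and do not reproduce the ARTIST tooling or its templates.

```python
import json
import statistics


def summarize_runs(provider, instance_type, benchmark, scores):
    """Summarize repeated benchmark scores for one Cloud offering.

    All field names are hypothetical; they only illustrate a
    machine-readable record that a decision-making system could consume.
    """
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate variation")
    mean = statistics.mean(scores)
    stdev = statistics.stdev(scores)  # sample standard deviation
    return {
        "provider": provider,
        "instance_type": instance_type,
        "benchmark": benchmark,
        "runs": len(scores),
        "mean": mean,
        "stdev": stdev,
        # Coefficient of variation: spread relative to the mean.
        "cv": stdev / mean if mean else None,
        "min": min(scores),
        "max": max(scores),
    }


# Made-up scores from five runs taken at different times.
record = summarize_runs("example-provider", "small", "example-benchmark",
                        [10.0, 12.0, 11.0, 9.0, 13.0])
print(json.dumps(record, indent=2))
```

Re-running the measurement periodically and appending a new record each time would let the refreshed values expose changes in the underlying infrastructure.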
The aim of this paper is to provide mechanisms
that address the aforementioned issues. A
benchmarking framework designed in the context of
the FP7 ARTIST project is presented in order to
measure the ability of various Cloud offerings to
serve a wide range of applications, from graphics and
databases to web serving and streaming. The
framework also defines a number of templates
in order to store this information in a
machine-understandable fashion, so that it may be
used by service selection mechanisms. What is more,
we define a metric, namely Service Efficiency (SE), in
order to rank different services based on a
combination of performance, cost and workload
factors.
The paper is structured as follows. In Chapter 2,
existing work is analysed. Chapter 3 describes the
ARTIST tools for mitigating these issues, while
Chapter 4 presents a case study on AWS EC2
resources. Finally, conclusions and future work are
given in Chapter 5.
2  RELATED WORK 
Related work for this paper spans the fields
of performance frameworks, available benchmark
services and description frameworks, and is based on
the corresponding analysis performed in the context of
the ARTIST project (ARTIST Consortium D7.2,
2013). With regard to performance frameworks, the
most relevant to our work is (Garg, 2012). In that
paper, a multi-level Cloud service comparison
framework is presented, including aspects such as
agility, availability, accountability, performance,
security and cost. An analytical hierarchical
process is also described in order to achieve the
optimal tradeoff between the parameters. While more
advanced in the area of combined metric
investigation, that work does not seem to include
the mechanism to launch and perform the
measurements. Skymark (Iosup et al., 2012) is a
framework designed to analyze the performance of
IaaS environments. The framework consists of two
components, Grenchmark and C-Meter.
Grenchmark is responsible for workload generation
and submission, while C-Meter consists of a job
scheduler and submits the job to a Cloud manager