Transparent Interoperability Middleware between Data and Service Cloud Layers

Elivaldo Lozer Fracalossi Ribeiro, Marcelo Aires Vieira, Daniela Barreiro Claro and Nathale Silva

FORMAS (Research Group on Formalisms and Semantic Applications) - LASID - DCC - PGCOMP,

Federal University of Bahia, s/n Adhemar de Barros Ave., Ondina, 40.170-110, Salvador, Bahia, Brazil

Keywords:

Cloud Computing, Interoperability, Middleware, DaaS, DBaaS.

Abstract:

Over the years, many organizations have been using cloud computing services to persist, consume and provide data. Models such as Software as a Service (SaaS), Data as a Service (DaaS), and Database as a Service (DBaaS) are consumed on demand to serve a specific purpose. In summary, SaaS is a delivery model for applications, while DaaS and DBaaS are models to provide data and database management systems on demand, respectively. SaaS applications require additional effort to access those data due to their heterogeneity: unstructured (e.g. text), semi-structured (e.g. XML, JSON), and structured formats (e.g. relational databases). Consequently, the lack of standardization in DaaS and DBaaS leads to a lack of interoperability among cloud layers. In this paper, we propose the middleware MIDAS (Middleware for DaaS and SaaS) to provide transparent interoperability between the Service (SaaS) and Data (DaaS and DBaaS) layers. Our current version of MIDAS addresses two important issues: (i) a formal description of our middleware and (ii) joining data from different DaaS and DBaaS. To evaluate our middleware, we provide a set of experiments that cover functionality, execution time, overhead, and interoperability. Our results demonstrate the effectiveness of our approach in addressing interoperability concerns in cloud computing environments.

1 INTRODUCTION

The volume of digital data grows exponentially, with an estimated total of 40 trillion gigabytes in 2020 (Gantz and Reinsel, 2012). Because these data need to be stored and available both to consumers and to organizations, data management has been facing challenges in handling the variety and amount of data. The cloud computing paradigm has emerged to meet some of these requirements, since it provides services with high availability and data distribution (Mell et al., 2011). By 2020, nearly 40% of the available data will be managed and stored by a cloud computing provider (Gantz and Reinsel, 2012).

Authors in (Armbrust et al., 2010) define cloud computing as a model that enables a ubiquitous and on-demand network of applications, platforms, and hardware, all provided as services. These services are organized into levels and consumed on demand by users in a pay-per-use scheme. Software as a Service (SaaS), Data as a Service (DaaS), and Database as a Service (DBaaS) are instances of service types organized in cloud levels. SaaS comprises cloud applications made available to end users via the Internet. DaaS provides data on demand through application programming interfaces (APIs). DBaaS provides database management systems (DBMS) with mechanisms for organizations to store, access and manipulate their databases (Hacigumus et al., 2002). Although easily confused, DaaS and DBaaS are different concepts.

The emergence of the Internet of Things (IoT), social networks and the use of web-enabled devices such as smartphones, laptops, and notebooks generate a huge volume and variety of data (Armbrust et al., 2010). Data are stored in unstructured, semi-structured or structured databases. Governments, institutions, and companies mostly use DaaS as a way to make their data (expenses, budgets, economic or census data) available to public or private users across the Internet (Barouti et al., 2013).

Access to DaaS and DBaaS in different cloud providers by SaaS applications requires, in most cases, substantial effort. This situation leads to a lock-in problem due to the lack of interoperability among cloud levels (Loutas et al., 2011; Silva et al., 2013). For instance, if demographic researchers need to study census data provided by governments in different DaaS (and/or DBaaS), they will have difficulty processing these data due to the lack of standards and, consequently, the lack of interoperability between SaaS and DaaS (and/or DBaaS). To address this interoperability issue, we propose our middleware called MIDAS (Middleware for DaaS and SaaS).

Lozer Fracalossi Ribeiro, E., Aires Vieira, M., Barreiro Claro, D. and Silva, N.

Transparent Interoperability Middleware between Data and Service Cloud Layers.

DOI: 10.5220/0006704101480157

In Proceedings of the 8th International Conference on Cloud Computing and Services Science (CLOSER 2018), pages 148-157

ISBN: 978-989-758-295-0

Our current version of MIDAS (MIDAS 1.8) is responsible for mediating the communication between different SaaS, DaaS, and DBaaS. MIDAS allows SaaS applications to retrieve data seamlessly from various cloud data sources, since it mediates all communication between SaaS and DaaS/DBaaS. Our version guarantees access to DaaS regardless of modifications made to its API.

We propose in this paper a new enhanced version of our middleware MIDAS to provide transparent interoperability among different cloud layers. The current version of MIDAS (MIDAS 1.8) handles two important issues: (i) a formal description of our approach and (ii) a join clause to manipulate different data sources (DaaS and DBaaS) in a single query. Some minor improvements were also made to MIDAS, such as: (i) recognition of different data query structures sent by SaaS, such as SQL and NoSQL queries; (ii) manipulation of different DaaS and DBaaS through statements such as join (SQL) and lookup (MongoDB); (iii) manipulation of different data models returned by DaaS and DBaaS, such as JSON, XML, CSV and tables; and (iv) return of the result in the format required by SaaS, such as JSON, XML, and CSV.

We performed some experiments to evaluate our novel approach, considering four important issues: functionality, execution time, overhead, and interoperability. Our results demonstrated that our middleware is effective, thus providing the desired results.

The remainder of this paper is organized as follows: Section 2 presents the most relevant related works; Section 3 describes our current version of MIDAS; Section 4 formalizes our middleware; Section 5 provides a set of experiments to validate our approach; Section 6 presents some results; and Section 7 concludes with future work.

2 RELATED WORKS

Some closely related works were proposed to address the lack of interoperability. In the medical field, authors in (Park and Moon, 2015) propose a solution for heterogeneous DBaaS that share medical data between different institutions. However, this approach handles data that follow the Health Level Seven (HL7) standards, thus minimizing efforts regarding heterogeneity.

The authors in (Igamberdiev et al., 2016) present a framework to solve problems in Big Data systems in the area of oil and gas. The goal is to automate the transfer of information between projects, identifying similarities and differences. Their framework handles only one data source per query, so data from more than one source cannot be merged.

Considering non-domain-specific interoperability solutions, there are two related works: (Sellami et al., 2014) and (Xu et al., 2016). These proposals do not deal with different types of NoSQL, nor do they envision handling NewSQL approaches. Besides, they manipulate data sources without joining them, and they do not work with data provided by DaaS. It is noteworthy that manipulating both DaaS and DBaaS is one of the main advantages of our proposal.

The Cloud Interoperability Broker (CIB) is a solution to interoperate different SaaS (Ali et al., 2016). This work was evaluated on a dataset through an actual application, but unlike our proposal, it does not consider the interoperability between SaaS and DaaS.

Our prior approach (MIDAS 1.6) (Vieira et al., 2017) had some limitations: (i) each DaaS must be manually inserted and updated; (ii) DBaaS is not supported; and (iii) data were returned to SaaS only in JSON format.

Thus, to the best of our knowledge, this is the first middleware that interoperates SaaS with DaaS and/or DBaaS in cloud environments.

3 THE CURRENT MIDAS

The current MIDAS architecture is depicted in Fig. 1. This novel approach is composed of six components: the Query Decomposer, which receives a query from SaaS and maps the statement into an internal structure; the Query Builder, which receives the decomposed query and builds a query to DaaS and/or DBaaS; the Data Mapping component, which identifies and obtains data from different DBaaS; the Dataset Information Storage (DIS), which stores the information about DaaS APIs; a Crawler, which keeps DIS up to date; and finally, the Result Formatter, which formats, associates, and selects data before returning them to SaaS.

The following components were included or modified to meet the goals of our current version: Data Mapping, Query Builder, Result Formatter, and Crawler. The other components, Query Decomposer and DIS, both work similarly to our previous version (Vieira et al., 2017).

The Data Mapping generates a DaaS from a DBaaS based on a manually maintained data dictionary. It identifies the DBaaS in which the data is stored and obtains the requested data. DBaaS data can be tables, columns, graphs, key-values or documents. The Query Builder accesses multiple DaaS in a single query if the query has a join statement (such as SQL join or MongoDB aggregation). In our current version, the Result Formatter receives data from both DaaS and DBaaS and merges them, regardless of the model. Finally, our Crawler keeps the DIS information up to date, considering that DaaS providers can change the parameters needed to conduct a query. Besides, the SaaS provider can now indicate the desired format for its result.

Figure 1: Our current MIDAS architecture.

Our Crawler has the challenging role of keeping DIS information up to date. DaaS is not standardized, so its API can change frequently. Moreover, DaaS providers are usually distributed. Our Crawler searches for every DaaS API information from its web page, ensuring that an update does not cause any harm to the applications. It was developed to run repeatedly, searching for information that differs from what is persisted in DIS. When a difference is found, it is recorded in DIS.
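The Crawler's compare-and-record cycle can be illustrated with a minimal Python sketch. It is not the MIDAS implementation: the DIS layout (a dictionary per DaaS), the function name `refresh_dis` and the `fetch_api_info` callback, which stands in for parsing a provider's web page, are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical DIS layout: DaaS name -> {key name -> stored value}.
DIS = Dict[str, Dict[str, str]]


def refresh_dis(dis: DIS,
                fetch_api_info: Callable[[str], Dict[str, str]]) -> DIS:
    """Compare crawled API information with DIS and record any differences.

    Returns, per DaaS, only the keys whose values changed.
    """
    changes: DIS = {}
    for daas_name, stored in dis.items():
        # Assumed callback: returns the API information currently published
        # by the provider (e.g. extracted from its documentation page).
        crawled = fetch_api_info(daas_name)
        diff = {k: v for k, v in crawled.items() if stored.get(k) != v}
        if diff:
            stored.update(diff)  # persist the new values in DIS
            changes[daas_name] = diff
    return changes
```

Running such a function periodically matches the description of a Crawler that runs repeatedly and writes to DIS only when the crawled information differs from what is stored.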

Fig. 2 illustrates the MIDAS execution sequence for a SQL query with a join statement that accesses one DaaS and two DBaaS. In this example, SaaS sends a SQL query to MIDAS, which decomposes it (through the Query Decomposer) and forwards it to the Query Builder. The Query Builder accesses the DIS and identifies that the data is in one DaaS (daas1) and two DBaaS (dbaas1 and dbaas2). The Query Builder builds the request to the DaaS and asks the Data Mapping to connect to both DBaaS to get the rest of the data. Each provider executes the request and returns the result to the Result Formatter (daas1, dbaas1, and dbaas2 return CSV, table, and document formats, respectively). The Result Formatter receives the data, performs the join, formats the result in the requested format (JSON), and forwards it to the SaaS.
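The last step of this sequence, returning the joined rows in the format requested by SaaS, can be sketched as follows. This is only an illustration under assumptions: the function name `format_result` and the representation of rows as a list of dictionaries are hypothetical, and only JSON and CSV are covered (XML is omitted for brevity).

```python
import csv
import io
import json
from typing import Dict, List


def format_result(rows: List[Dict[str, object]], fmt: str) -> str:
    """Serialize already-joined rows into the format requested by SaaS."""
    if fmt == "json":
        return json.dumps(rows)
    if fmt == "csv":
        buffer = io.StringIO()
        # Column order follows the first row; an empty result has no columns.
        fieldnames = list(rows[0]) if rows else []
        writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                                lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"unsupported return format: {fmt}")
```

Keeping the serialization separate from the join means that the source formats (CSV, table, document) and the return format can vary independently.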

4 FORMAL MODEL OF MIDAS

The formal model of MIDAS aims to explain the communication among its modules. The formalization of MIDAS is based on canonical models (Schreiner et al., 2015) with trees and sets of keys and values.

Definition 1 (MIDAS internal structure). The structure used internally by MIDAS (MIDASql) is a tuple $MIDASql = (mDIS, mSaaS, mDaaS)$, where: mDIS is the canonical model of the DaaS present in DIS; mSaaS is the canonical model that maps the query (sent by SaaS); and mDaaS is the canonical model that maps DaaS return(s).

Figure 2: MIDAS execution sequence among one DaaS and two DBaaS through a join statement.

In the following sections, each canonical model is detailed.

4.1 Canonical Model mDIS

Definition 2 (mDIS). The canonical model that stores DIS information (mDIS) is a tuple $mDIS = (N^{root}, DAAS)$, where: $N^{root}$ is the name of the model; and $DAAS$ is a set of DaaS models (daas set).

Definition 3 (daas). The canonical model for a specific DaaS ($daas \in DAAS$) is a tuple $daas = (N^{root}_{daas}, K)$, where: $N^{root}_{daas}$ is the name of the DaaS; and $K$ is a predefined set of keys ($k$) for each DaaS, where $K = \{domain, search\_path, query, sort, limit, dataset, records, fields, format\}$.

Definition 4 (k). A key $k \in K$ is a piece of information about daas, defined as $k = (N^{root}, i)$, where: $N^{root}$ is the name used to characterize a specific piece of information about daas, $k.N^{root} \in K$; and $i$ is the information about $k$, which can be empty, atomic, or multivalued.

Considering a hypothetical DIS with two DaaS (NYC and v8), part of the canonical model mDIS can be seen in Fig. 3: the main node stores the beginning of subtrees, where each subtree stores the information about a particular DaaS. Each node $i$ stores the information about the key $k$ located at the level immediately above.
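To make Definitions 2 to 4 concrete, the tree can be written as nested dictionaries. The sketch below is one possible encoding, not the structure used inside MIDAS; the helper names `make_daas` and `make_mdis` and all key values are hypothetical, since the text does not list the actual DIS content.

```python
from typing import Dict, List, Union

# Information stored under a key: empty (None), atomic (str) or multivalued (list).
Info = Union[None, str, List[str]]

# The predefined key set K from Definition 3.
K = ("domain", "search_path", "query", "sort", "limit",
     "dataset", "records", "fields", "format")


def make_daas(name: str, **info: Info) -> Dict[str, object]:
    """Build one daas model: its name plus one entry per key in K."""
    unknown = set(info) - set(K)
    if unknown:
        raise KeyError(f"keys outside K: {sorted(unknown)}")
    return {"name": name, "keys": {k: info.get(k) for k in K}}


def make_mdis(*daas_models: Dict[str, object]) -> Dict[str, object]:
    """Build mDIS: the root name plus the set of daas models, indexed by name."""
    return {"name": "DIS", "daas": {d["name"]: d for d in daas_models}}


# Hypothetical values; the real DIS content is not given in the text.
mdis = make_mdis(
    make_daas("nyc", domain="https://example.org", search_path="/api/",
              query="q", sort="sort", limit="rows", format="json"),
    make_daas("v8"),  # every key left empty
)
```

Each daas entry always carries all nine keys of $K$, so an empty piece of information is represented explicitly rather than by a missing key.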

4.2 Canonical Model mSaaS

Definition 5 (mSaaS). The canonical model mSaaS converts the query (submitted by SaaS) into a set of $n$ queries (to be sent to DaaS), where $n$ indicates the number of relations in the query (e.g., $n = 2$ indicates a join with 2 tables), $n \geq 1$. The model mSaaS is a tuple $mSaaS = (N^{root}, C')$, where: $N^{root}$ is the value of $n$; and $C'$ is a set of first-level clauses ($c'$) used in the mapping to identify queries and operations.

Figure 3: Example of mDIS for two DaaS.

Definition 6 ($c'$). A first-level clause $c' \in C'$ stores specific information about a query OR about an operation, and it is a tuple $c' = (N^{root}, C'')$, where: $N^{root}$ is the name that identifies the query OR the operation; and $C''$ is a set of second-level clauses ($c''$) used in the mapping to identify the query attributes and operations. Some important observations: (i) $N^{root} \in \{q_1, ..., q_n, param\}$, where $q_i$ is the $i$-th relation and $param$ is a node for storing data about join, order by and limit; and (ii) given $n$, there are $n + 1$ clauses $c'$.

Definition 7 ($c''$). A second-level clause $c'' \in C''$ stores information about clauses of a query OR clauses of an operation, and it is a tuple $c'' = (N^{root}, V)$, where: $N^{root}$ is the name that identifies the clause of a query OR the clause of an operation; and $V$ is a set of values ($v$) for each $c''$. Some important observations: (i) if $c'$ represents a query $q_i$, then $N^{root}$ indicates the $j$ attributes ($j \geq 0$) of $q_i$ stored, where $N^{root} \in \{Projection, Selection, Dataset\}$; (ii) if $c'$ represents $param$, then $N^{root}$ indicates the $j$ attributes ($j \geq 0$) of the $n$ relations, where $N^{root} \in \{OrderBy, Limit, TypeJoin, CondJoin, Return\}$; and (iii) given $n$, there are $3n + 5$ clauses $c''$.

Definition 8 (v). A value $v \in V$ is an element representing information about $c''$. Depending on $c''$, $V$ may be empty, atomic, or multivalued. Thus, $V = \emptyset$ or $V = \{v_1, ..., v_w\}$, where: $v_i$ is the $i$-th value $v$ for $c''$; and $w$ is the number of values $v$ in the set $V$ of the key $c''$, i.e., $v \in V$.

For instance, the query of Table 1 (presented in SQL and NoSQL) generates the canonical model presented in Fig. 4, while the query in Table 2 generates the canonical model presented in Fig. 5.

Table 1: Example of a query in SQL and in NoSQL (MongoDB) without join/aggregation.

SQL               | NoSQL (MongoDB)
SELECT name, age  | w7.find(                  (from)
FROM w7           |   {‘age=10’},             (where)
WHERE age=10      |   {‘name’:1, ‘age’:1})    (select)
ORDER BY name     |   .limit(10)              (limit)
LIMIT 10          |   .sort({‘name’:1});      (order by)

Figure 4: Example of mSaaS for query in Table 1.

Figure 5: Example of mSaaS for query in Table 2.
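As an illustration of Definitions 5 to 8, the mSaaS tree for the query in Table 1 can be encoded as nested dictionaries. This is a sketch under assumptions, not the internal MIDAS structure: the function `build_msaas` and the dictionary layout are hypothetical, while the clause names and the values come from the definitions and from Table 1.

```python
from typing import Dict, List

# Second-level clause names from Definition 7.
QUERY_CLAUSES = ("Projection", "Selection", "Dataset")
PARAM_CLAUSES = ("OrderBy", "Limit", "TypeJoin", "CondJoin", "Return")

Clauses = Dict[str, List[str]]


def build_msaas(relations: List[Clauses], param: Clauses) -> Dict[str, object]:
    """Build mSaaS: n, one first-level clause per relation, plus 'param'."""
    msaas: Dict[str, object] = {"n": len(relations)}
    for i, relation in enumerate(relations, start=1):
        # Missing clauses are kept as empty value sets (V may be empty).
        msaas[f"q{i}"] = {c: list(relation.get(c, [])) for c in QUERY_CLAUSES}
    msaas["param"] = {c: list(param.get(c, [])) for c in PARAM_CLAUSES}
    return msaas


# The query of Table 1: one relation, no join.
table1 = build_msaas(
    relations=[{"Projection": ["name", "age"],
                "Selection": ["age=10"],
                "Dataset": ["w7"]}],
    param={"OrderBy": ["name"], "Limit": ["10"], "Return": ["json"]},
)
```

For $n = 1$ this yields $n + 1 = 2$ first-level clauses (q1 and param) and $3n + 5 = 8$ second-level clauses, matching the counts stated in the definitions.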

After mDIS and mSaaS are generated, it is necessary to transform both canonical models into a set of URLs to submit to DaaS. Since data from DaaS is retrieved through a Uniform Resource Locator (URL), MIDASql provides a mechanism to convert mDIS and mSaaS into a set of URLs: the function generateURLs(). Our function has the following prototype: “URLs generateURLs(mDIS, mSaaS)”. This means that, given an mDIS and an mSaaS, generateURLs() must return a set of URLs, where: each URL is a concatenation of mDIS and mSaaS elements; and the number of URLs is equal to the number of query relations ($n$, $n \geq 1$), i.e., each $q_i$ (in mSaaS) generates $URL_i$. For this, we assume that: “+” is an operator that concatenates two strings (literals or variables); and $ch(p)$ is a function that returns the contents of the child(ren) of node $p$.

Thus, considering $DSname = ch(q_i.dataset)$, $URL_i$ is generated according to Fig. 6.

Figure 6: Concatenations that the generateURLs() function uses to generate $URL_i$.

Considering the function generateURLs(), some observations are important: (i) when $ch(p)$ does not return any element, the corresponding line $p$ in $URL_i$ must be disregarded; (ii) a multivalued result of $ch(p)$ is separated by commas; (iii) the last two lines occur only for $n = 1$; and (iv) for $n \geq 2$, $ch(q_i.Projection)$ must initially include the corresponding $ch(param.CondJoin)$ if the join attribute is not part of the projection attribute set (i.e., if $ch(param.CondJoin) \notin ch(q_i.Projection)$).

Given the mDIS of Fig. 7 and the mSaaS shown in Fig. 4, generateURLs() generates the following URL: $URL_1$ = <http://w7.com/api/w/?dsw=w7&rcw=name,age&q=age=10&sort=name&rows=10>.

Figure 7: Example of mDIS for two DaaS.

On the other hand, given the same mDIS from the previous example (Fig. 7) and the mSaaS shown in Fig. 5, generateURLs() generates the following URLs: $URL_1$ = <http://w7.com/api/w/?dsw=w7&rcw=b1,name,age&qw=b1=’queens’>; and $URL_2$ = <http://vz.com/api/v/?dsv=vz&rcv=b2,phone>.

Table 2: Example of a query in SQL and in NoSQL (MongoDB) with join/aggregation.

SQL                              | NoSQL (MongoDB)
SELECT w7.name, w7.age, vz.phone | db.w7.aggregate([                                              (from)
FROM w7                          |   {$lookup: {from:‘vz’, localField:‘b1’, foreignField:‘b2’}},  (join)
LEFT OUTER JOIN vz               |   {$match: {w7.b1=‘queens’}},                                  (where)
ON w7.b1=vz.b2                   |   {$project: {w7.name:1, w7.age:1, vz.phone:1}},               (select)
WHERE w7.b1=‘queens’             |   {$sort: {w7.name:1}},                                        (order by)
ORDER BY w7.name                 |   {$limit: 3}                                                  (limit)
LIMIT 3                          | ]);
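The behaviour described for generateURLs() can be approximated with the Python sketch below. Several points are assumptions rather than facts from the paper: the exact concatenation order is given in Fig. 6 and is inferred here from the single-relation example URL; the parameter names stored in the hypothetical `MDIS` dictionary are chosen so that the Table 1 query reproduces that URL; and CondJoin is assumed to hold one join attribute per relation, in relation order.

```python
from typing import Dict, List

# Hypothetical mDIS content, indexed by dataset name (DSname).
MDIS: Dict[str, Dict[str, str]] = {
    "w7": {"domain": "http://w7.com", "search_path": "/api/w/",
           "dataset": "dsw", "records": "rcw",
           "query": "q", "sort": "sort", "limit": "rows"},
    "vz": {"domain": "http://vz.com", "search_path": "/api/v/",
           "dataset": "dsv", "records": "rcv",
           "query": "qv", "sort": "sv", "limit": "lv"},
}

# mSaaS for the query of Table 1 (single relation, no join).
TABLE1 = {
    "n": 1,
    "q1": {"Projection": ["name", "age"], "Selection": ["age=10"],
           "Dataset": ["w7"]},
    "param": {"OrderBy": ["name"], "Limit": ["10"], "TypeJoin": [],
              "CondJoin": [], "Return": ["json"]},
}


def generate_urls(mdis: Dict[str, Dict[str, str]], msaas: dict) -> List[str]:
    """Return one URL per relation q_i of mSaaS."""
    n = msaas["n"]
    param = msaas["param"]
    urls = []
    for i in range(1, n + 1):
        q = msaas[f"q{i}"]
        ds_name = q["Dataset"][0]
        daas = mdis[ds_name]
        projection = list(q["Projection"])
        if n >= 2:
            # Observation (iv): the join attribute must be retrieved too.
            join_attr = param["CondJoin"][i - 1]
            if join_attr not in projection:
                projection.insert(0, join_attr)
        parts = [(daas["dataset"], [ds_name]),
                 (daas["records"], projection),
                 (daas["query"], q["Selection"])]
        if n == 1:
            # Observation (iii): sort and limit are sent only without a join.
            parts += [(daas["sort"], param["OrderBy"]),
                      (daas["limit"], param["Limit"])]
        # Observations (i) and (ii): skip empty parts, comma-join multivalues.
        query = "&".join(f"{key}={','.join(values)}"
                         for key, values in parts if values)
        urls.append(f"{daas['domain']}{daas['search_path']}?{query}")
    return urls
```

With these assumed values, `generate_urls(MDIS, TABLE1)` yields the single URL shown for the Table 1 query. For a join, sort and limit are left out of the URLs because they can only be applied after the data from both sources has been merged.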

4.3 Canonical Model mDaaS

For each generated URL, the corresponding DaaS returns the requested dataset. Before sending the results to SaaS, MIDAS performs some operations to make the data “presentable”, such as join, order by, and limit, if applicable. This treatment is carried out employing the canonical model mDaaS.

Definition 9 (mDaaS). The canonical model mDaaS maps the output of $n$ DaaS. Each DaaS sends a return (in the format described in mDIS) for each URL. If $n = 1$, then mDaaS just converts $ch(DIS.ch(q_1.dataset).format)$ (the format returned by DaaS) into $ch(param.Return)$ (the format desired by SaaS), and the process is finalized. On the other hand, when $n \geq 2$ (i.e., if there is a join), the relations are mapped two by two, so that mDaaS generates $n$ canonical mappings. In this case ($n \geq 2$), mDaaS is a tuple $mDaaS = (N^{root}, CJ)$, where: $N^{root}$ is the name of the DaaS model ($q_iD$, where $i$ is the $i$-th relation); and $CJ$ is the distinct set of $ch(param.CondJoin)$ values ($cj$) in the corresponding relation.

Definition 10 (cj). An element $cj \in CJ$ is a value that the join condition $ch(param.CondJoin)$ assumes in the corresponding relation, where $cj$ is a tuple $cj = (N^{root}_{cj}, L)$, where: $N^{root}_{cj}$ is the name that identifies the value $cj$; and $L$ is a set of lists ($l$) with all attributes that contain $cj$.

Definition 11 (l). A list $l \in L$ contains all elements of the same tuple of which $cj$ is part, in the same order of occurrence as in the relation (from left to right). The number of lists $l \in L$ is equal to the number of occurrences of $cj$ in the relation; thus $l = \{a_1, ..., a_m\}$, where: $a_i$ is the $i$-th attribute $a$ for each $l$ in $cj$; and $m$ is the number of attributes $a \in l$.

Considering that the query in Table 2 (with join) returns the two sets of data presented in Table 3, the canonical models (mDaaS) are shown in Fig. 8.

Table 3: DaaS returns for query presented in Table 2.

w7                 | vz
b1 | name   | age  | b2 | phone
1  | Andrew | 14   | 1  | p1
2  | Bruce  | 35   | 1  | p2
1  | Carl   | 13   | 30 | p3
3  | Dylan  | 34   | 2  | p4
2  | Erik   | 65   | 5  | p5

Figure 8: Example of mDaaS for query in Table 2.

Once the mDaaS has been generated, the join can be performed. The next step depends on the value of $ch(param.TypeJoin)$. For this, in addition to the functions already mentioned, we assume that: $lch(p)$ is a function that returns the last child of a node $p$; and $con(p_1, p_2)$ is a function that connects the node $p_1$ to the node $p_2$.

If $ch(param.TypeJoin)$ = ‘left outer’, the join is performed as follows:

a) $\forall cj_1 \in ch(q_1D)$ and $\forall cj_2 \in ch(q_2D)$, $con(lch(q_1D.cj_1), ch(q_2D.cj_2))$, $\forall cj_1 = cj_2$;

b) if $cj_1 \notin ch(q_1.Projection)$, then (i) $con(q_1D, ch(q_1D.cj_1))$ is performed and (ii) $cj_1$ is removed;

c) if there is $ch(param.OrderBy)$, this node is sorted;

d) if there is $ch(param.Limit)$, this must be the total of $ch(q_1D)$; and finally

e) $q_1D$ is converted to $ch(param.Return)$ and it is sent to SaaS.

Considering the mDaaS of Fig. 8, the execution of the described steps should result in Fig. 9.

Figure 9: Example of mDaaS for query presented in Table 2 after left outer join.
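The intent of steps a) to d) can be illustrated on the data of Table 3 with a short Python sketch. It is a simplification under assumptions: it works on flat rows grouped by join value rather than on the mDaaS tree, it applies ordering and limit with the usual SQL row semantics, and the function names are hypothetical. It therefore shows what the join computes, not the exact node operations of the formal model.

```python
from collections import defaultdict
from typing import Dict, List, Optional

Row = Dict[str, object]


def group_by_join_value(rows: List[Row], join_attr: str) -> Dict[object, List[Row]]:
    """Group rows by the value of the join attribute (the cj nodes of mDaaS)."""
    groups: Dict[object, List[Row]] = defaultdict(list)
    for row in rows:
        groups[row[join_attr]].append(row)
    return groups


def left_outer_join(left: List[Row], right: List[Row],
                    left_attr: str, right_attr: str,
                    projection: List[str],
                    order_by: Optional[str] = None,
                    limit: Optional[int] = None) -> List[Row]:
    """Left outer join followed by projection, ordering and limit."""
    right_groups = group_by_join_value(right, right_attr)
    joined: List[Row] = []
    for left_row in left:
        # A left row without a match is kept, with missing right attributes.
        matches = right_groups.get(left_row[left_attr]) or [{}]
        for right_row in matches:
            merged = {**right_row, **left_row}
            joined.append({attr: merged.get(attr) for attr in projection})
    if order_by is not None:
        joined.sort(key=lambda row: row[order_by])
    if limit is not None:
        joined = joined[:limit]
    return joined


# The DaaS returns of Table 3.
w7 = [{"b1": 1, "name": "Andrew", "age": 14}, {"b1": 2, "name": "Bruce", "age": 35},
      {"b1": 1, "name": "Carl", "age": 13}, {"b1": 3, "name": "Dylan", "age": 34},
      {"b1": 2, "name": "Erik", "age": 65}]
vz = [{"b2": 1, "phone": "p1"}, {"b2": 1, "phone": "p2"}, {"b2": 30, "phone": "p3"},
      {"b2": 2, "phone": "p4"}, {"b2": 5, "phone": "p5"}]

result = left_outer_join(w7, vz, "b1", "b2",
                         projection=["name", "age", "phone"],
                         order_by="name", limit=3)
```

Because the join attributes b1 and b2 are not in the projection, they are dropped from the output rows, which mirrors step b).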

5 EVALUATION

To evaluate our middleware, we performed a set of three experiments. These experiments delimit the relationship between SaaS and DaaS/DBaaS. A query without a join (or aggregation) statement connects one SaaS to only one DaaS or one DBaaS provider (Experiments 1 and 2). Queries with join (or aggregation) statements allow the SaaS level to relate more than one DaaS and/or more than one DBaaS provider (Experiment 3).

Firstly, we evaluate the overhead of our middleware. We submitted 100 queries directly to both DaaS and DBaaS and compared the results with access through MIDAS. Queries were performed to return 100, 1000, and 10000 records. Secondly, we evaluate whether the query language (SQL and NoSQL) influences the access time to different data sources (DaaS and DBaaS). Through MIDAS, we submitted 100 queries: (i) MongoDB to DaaS; (ii) SQL to DaaS; (iii) MongoDB to DBaaS; and (iv) SQL to DBaaS. Thirdly, we evaluate the interoperability of our proposal. In this experiment, we submit 100 queries to more than one data source: (i) 2 DBaaS; (ii) 2 DaaS; and (iii) 1 DaaS and 1 DBaaS.

In experiment 1 we evaluated overhead; in experiment 3 we evaluated interoperability; and in all experiments (1, 2 and 3) we evaluated functionality and execution time. The average time (in ms) of each task was recorded with the Apache JMeter tool (http://jmeter.apache.org/).

5.1 Our Case Study

Our current MIDAS is based on open source technologies that are found in any cloud with PHP support. It was developed on the Heroku cloud (https://www.heroku.com/) because it is an open cloud with sufficient storage space and a complete Platform as a Service (PaaS) for our project. To simulate a SaaS provider, we developed a PHP-based web application, Demographic Statistics by NY Hospital. This web application is hosted in a Heroku SaaS instance, and it can be accessed at <https://midas-saas.herokuapp.com>.

Regarding the DaaS service level, three different DaaS providers are used to perform our tests and experiments ($P_1$: <https://goo.gl/7sVsZB>; $P_2$: <https://goo.gl/E4YmYH>; and $P_3$: <https://goo.gl/vJomwT>):

• $P_1$: Transportation Sites, with 13600 instances and 18 attributes;

• $P_2$: Hospital General Information, with 4812 instances and 29 attributes; and

• $P_3$: Demographic Statistics By Zip Code, with 236 instances and 46 attributes.

The same datasets provided by DaaS were persisted into two DBaaS: one in JawsDB (https://www.jawsdb.com/) and one in mLab (https://www.mlab.com/). The DBaaS are based on MySQL and MongoDB, respectively. The choice of MySQL and MongoDB was motivated by their being the most widely used freely available relational and NoSQL systems (according to the ranking at https://db-engines.com/en/ranking). Our application (simulating SaaS) performs a join between two of these datasets.

5.2 Experiments

To evaluate our middleware, we performed three experiments: ($E_1$) overhead; ($E_2$) performance of different queries; and ($E_3$) data join and interoperability.

In the first experiment, we submitted 100 queries to both data sources (DaaS and DBaaS) with and without MIDAS. We vary the number of records returned (100, 1000, and 10000). This allows evaluating the influence of MIDAS on the communication between SaaS and DaaS/DBaaS. For this, in the first experiment we submit:

• 100 queries directly to the DaaS provider;

• 100 queries to the DaaS provider through MIDAS;

• 100 queries directly to the DBaaS provider; and

• 100 queries to the DBaaS provider through MIDAS.

As stated, we evaluated whether the query language influences access time depending on the data source. Thus, in the second experiment we submit:

• 100 MongoDB queries to the DaaS provider through MIDAS;

• 100 SQL queries to the DaaS provider through MIDAS;

• 100 MongoDB queries to the DBaaS provider through MIDAS; and

• 100 SQL queries to the DBaaS provider through MIDAS.

Finally, our third experiment evaluates the interoperability of MIDAS. We estimate the average execution time required for MIDAS to relate data from different sources through the join (or aggregation) statement. The association of the data was made through a zip code field, named Zip in one dataset and Zip Code in the other. For this, we submit:

• 100 queries with a join statement to two DaaS providers through MIDAS;

• 100 queries with a join statement to two DBaaS providers through MIDAS; and

• 100 queries with a join statement to one DaaS and one DBaaS provider through MIDAS.

6 RESULTS

In this section, we present and discuss the results of our experiments.

6.1 Results from Experiment 1

The results obtained from experiment 1 were classified based on the value assigned to the query limit. This value defines the number of records returned and was restricted to 100, 1000 and 10000 data records.

Firstly, we submitted 100 queries to return 100 data records. In this case, Fig. 10 shows the average execution time:

• 1267.77 ± 276.22 ms for queries without MIDAS to DaaS;

• 2052.37 ± 2658.98 ms for queries through MIDAS to DaaS;

• 489.76 ± 367.30 ms for queries without MIDAS to DBaaS; and

• 2128.15 ± 219.87 ms for queries through MIDAS to DBaaS.

Figure 10: Return time (y-axis) for each of the 100 queries submitted (x-axis) with a limit of 100 records.

Secondly, we submitted 100 queries to return 1000 data records. In this case, Fig. 11 shows the average execution time:

• 1372.92 ± 275.70 ms for queries without MIDAS to DaaS;

• 3071.09 ± 585.30 ms for queries through MIDAS to DaaS;

• 896.51 ± 22.83 ms for queries without MIDAS to DBaaS; and

• 2813.19 ± 198.26 ms for queries through MIDAS to DBaaS.

Figure 11: Return time (y-axis) for each of the 100 queries submitted (x-axis) with a limit of 1000 records.

Finally, we submitted 100 queries to return 10000 data records. In this case, Fig. 12 shows the average execution time:

• 7917.02 ± 1045.84 ms for queries without MIDAS to DaaS;

• 35039.22 ± 1420.75 ms for queries through MIDAS to DaaS;

• 4260.8 ± 61.25 ms for queries without MIDAS to DBaaS; and

• 30023.41 ± 1213.57 ms for queries through MIDAS to DBaaS.

Figure 12: Return time (y-axis) for each of the 100 queries submitted (x-axis) with a limit of 10000 records.

Regarding the overhead caused by MIDAS, we can observe that the average differences between direct queries to DaaS and DBaaS, respectively, and access through MIDAS were: (i) 42.4% and 368.8%, for 100 data records; (ii) 123.7% and 213.8%, for 1000 data records; and (iii) 342.6% and 604.6%, for 10000 records. Time values are affected by (i) data traffic on the Internet and (ii) the MIDAS infrastructure. These results indicate that the algorithms need optimization, which is outside the scope of this work.
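The reported percentages are consistent with a relative overhead computed as (time through MIDAS - direct time) / direct time. The sketch below, whose function and variable names are illustrative, applies this formula to the mean times listed for the 1000- and 10000-record runs and reproduces the values in items (ii) and (iii). Applying the same formula to the 100-record means as printed gives roughly 61.9% and 334.5% rather than the reported 42.4% and 368.8%, so those two figures may rest on values not shown here.

```python
def overhead_pct(direct_ms: float, via_midas_ms: float) -> float:
    """Relative overhead of MIDAS over direct access, in percent."""
    return (via_midas_ms - direct_ms) / direct_ms * 100.0


# Mean times in ms: (direct access, access through MIDAS).
MEANS = {
    1000: {"DaaS": (1372.92, 3071.09), "DBaaS": (896.51, 2813.19)},
    10000: {"DaaS": (7917.02, 35039.22), "DBaaS": (4260.8, 30023.41)},
}

OVERHEADS = {
    limit: {source: round(overhead_pct(direct, midas), 1)
            for source, (direct, midas) in sources.items()}
    for limit, sources in MEANS.items()
}

for limit, sources in OVERHEADS.items():
    for source, pct in sources.items():
        print(f"{limit} records, {source}: {pct}% overhead")
```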

6.2 Results from Experiment 2

In this experiment, we combine two query languages (SQL and NoSQL) with both sources (DaaS and DBaaS).

As Fig. 13 shows, the following average execution times were obtained:

• 33569.03 ± 2663.39 ms for MongoDB queries through MIDAS to DaaS;

• 35039.22 ± 1420.75 ms for SQL queries through MIDAS to DaaS;

• 29415.03 ± 1065.52 ms for MongoDB queries through MIDAS to DBaaS; and

• 30023.41 ± 1213.57 ms for SQL queries through MIDAS to DBaaS.

Figure 13: Return time (y-axis) for each of the 100 queries submitted (x-axis) from different languages to different data sources.

We can observe that: (i) for access to DaaS, SQL queries were 4.4% slower; while (ii) for DBaaS access, SQL queries were 2% slower. The time difference between the two types of queries is minimal, so the choice of query language carries no significant penalty.

6.3 Results from Experiment 3

In this experiment, we performed queries with join statements that access two different DaaS, two different DBaaS, and one DaaS with one DBaaS.

Figure 14 depicts the average execution time:

• 12357.08 ± 6831.42 ms for two DaaS providers;

• 126957.46 ± 55870.66 ms for two DBaaS providers; and

• 22707.84 ± 9324.02 ms for one DaaS and one DBaaS.

Figure 14: Return time (y-axis) for each of the 100 queries (x-axis) with join (or aggregation) statement.

In this experiment, we can observe that (i) the average query time to 2 DBaaS is 459% slower than queries to 1 DaaS and 1 DBaaS, and 927.4% slower than queries to 2 DaaS; and (ii) the average time for queries to 1 DaaS and 1 DBaaS is 83.8% slower than queries to 2 DaaS. When using DBaaS, the time values are higher than those presented by DaaS, due to the process of accessing and processing the data in the DBaaS.
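As a worked check, the three percentages can be recovered from the averages listed for this experiment. We assume here that the relative slowdown of configuration $a$ with respect to configuration $b$ is defined as below; the text does not state the formula explicitly, but this definition reproduces the reported values.

```latex
\begin{align*}
\text{slowdown}(a, b) &= \frac{\bar{t}_a - \bar{t}_b}{\bar{t}_b} \times 100\% \\[4pt]
\text{2 DBaaS vs. 1 DaaS + 1 DBaaS:}\quad
  & \frac{126957.46 - 22707.84}{22707.84} \times 100\% \approx 459\% \\[4pt]
\text{2 DBaaS vs. 2 DaaS:}\quad
  & \frac{126957.46 - 12357.08}{12357.08} \times 100\% \approx 927.4\% \\[4pt]
\text{1 DaaS + 1 DBaaS vs. 2 DaaS:}\quad
  & \frac{22707.84 - 12357.08}{12357.08} \times 100\% \approx 83.8\%
\end{align*}
```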

6.4 Discussions

Our case study evaluates MIDAS through its overhead

and different languages and data sources.

Although the execution time was proportional to the submitted query, the results of the first experiment show that MIDAS introduces extra overhead compared with direct queries. This degradation was expected because of the new layer introduced between SaaS and DaaS. It is noteworthy that network bandwidth, cloud providers, and latency might also influence those results.

Considering DBaaS, we observed that direct access is faster than access through MIDAS. In fact, MIDAS deals with DBaaS as a DaaS, through the Data Mapping module.

The second experiment indicates that the language used by a SaaS (i.e., SQL or NoSQL) does not influence the query performance or the return time with either data source (i.e., DaaS or DBaaS).

Finally, the third experiment indicates that DBaaS needs to be analyzed more deeply. Although the join clause has complexity $O(n^2)$ (2: number of data sources), the join degrades performance by almost 1 minute. On the other hand, results on DaaS took less than 23 seconds. We can state that the benefits of our approach, interoperating different data sources through join clauses, outweigh the time spent on gathering the results.
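To illustrate where a quadratic join cost can come from, the sketch below performs a naive nested-loop join over the records returned by two data sources. It is not the MIDAS join algorithm, which is not shown in this section; the function name `nested_loop_join`, the record layout, and the sample values are hypothetical and serve only to show that two sources with n records each lead to n × n comparisons.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def nested_loop_join(
    left: List[Record], right: List[Record], key: str
) -> Tuple[List[Record], int]:
    """Join two record lists on `key`; also return the number of comparisons made."""
    joined: List[Record] = []
    comparisons = 0
    for l_row in left:
        for r_row in right:
            comparisons += 1
            # Merge the two records only when both carry the key with equal values.
            if key in l_row and key in r_row and l_row[key] == r_row[key]:
                joined.append({**l_row, **r_row})
    return joined, comparisons


# Hypothetical records standing in for the results of two data sources.
daas_rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
dbaas_rows = [{"id": 2, "value": 20}, {"id": 3, "value": 30}, {"id": 4, "value": 40}]

rows, comparisons = nested_loop_join(daas_rows, dbaas_rows, "id")
print(rows)         # records whose "id" appears in both sources
print(comparisons)  # len(daas_rows) * len(dbaas_rows)
```

A hash-based join would avoid the pairwise comparisons, which is one possible direction for the algorithm improvements listed as future work.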

There is one threat to validity: all data sources were public. Thus, data that goes offline for any reason can compromise our approach.

CLOSER 2018 - 8th International Conference on Cloud Computing and Services Science


7 CONCLUSIONS AND FUTURE WORK

In this paper, we propose a new version of MIDAS, describing a formal model to provide interoperability among cloud layers. We performed experiments to validate our results and to show the effectiveness of our proposal.

Our middleware requires minimal adaptation from SaaS applications despite the complexity of dealing with the interoperability problem between application services and heterogeneous data in cloud environments. As contributions, unlike the previous version (1.6), our solution (i) supports joining data from different DaaS and DBaaS, enabling data to be gathered from various data sources; (ii) automatically populates the DIS and keeps it up to date; and (iii) considers other SaaS return formats in addition to JSON.

As future work, we intend to continue improving MIDAS by adding new features, such as (i) recognizing SPARQL queries and other types of NoSQL; (ii) automating the Crawler to search for novel DaaS and disambiguating data from heterogeneous data sources; and (iii) improving the algorithms for better results.

ACKNOWLEDGEMENTS

The authors would like to thank FAPESB (Foundation for Research Support of the State of Bahia) for financial support.

REFERENCES

Ali, H., Moawad, R., and Hosni, A. A. F. (2016). A Cloud

Interoperability Broker (CIB) for data migration in

SaaS. In 2016 IEEE International Conference on

Cloud Computing and Big Data Analysis (ICCCBDA),

pages 250–256.

Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz,

R., Konwinski, A., Lee, G., Patterson, D., Rabkin, A.,

Stoica, I., and Zaharia, M. (2010). A view of cloud

computing. Commun. ACM, 53(4):50–58.

Barouti, S., Alhadidi, D., and Debbabi, M. (2013). Symmetrically-private database search in cloud computing. In Cloud Computing Technology and Science (CloudCom), 2013 IEEE 5th International Conference on, volume 1, pages 671–678. IEEE.

Gantz, J. and Reinsel, D. (2012). The digital universe in

2020: Big data, bigger digital shadows, and biggest

growth in the far east. IDC iView: IDC Analyze the

future, 2007(2012):1–16.

Hacigumus, H., Iyer, B., and Mehrotra, S. (2002). Providing database as a service. In Data Engineering, 2002. Proceedings. 18th International Conference on, pages 29–38. IEEE.

Igamberdiev, M., Grossmann, G., Selway, M., and Stumptner, M. (2016). An integrated multi-level modeling approach for industrial-scale data interoperability. Software & Systems Modeling, pages 1–26.

Loutas, N., Kamateri, E., Bosi, F., and Tarabanis, K. (2011). Cloud computing interoperability: The state of play. In Cloud Computing Technology and Science (CloudCom), 2011 IEEE Third International Conference on, pages 752–757. IEEE.

Mell, P., Grance, T., et al. (2011). The NIST deﬁnition of

cloud computing.

Park, H.-K. and Moon, S.-J. (2015). DBaaS using HL7 based on XMDR-DAI for medical information sharing in cloud. International Journal of Multimedia and Ubiquitous Engineering, 10(9):111–120.

Schreiner, G. A., Duarte, D., and Mello, R. d. S. (2015). SQLtoKeyNoSQL: a layer for relational to key-based NoSQL database mapping. In Proceedings of the 17th International Conference on Information Integration and Web-based Applications & Services, page 74. ACM.

Sellami, R., Bhiri, S., and Defude, B. (2014). ODBAPI:

a uniﬁed REST API for relational and NoSQL data

stores. In Big Data (BigData Congress), 2014 IEEE

International Congress on, pages 653–660. IEEE.

Silva, G. C., Rose, L. M., and Calinescu, R. (2013). A systematic review of cloud lock-in solutions. In Cloud Computing Technology and Science (CloudCom), 2013 IEEE 5th International Conference on, volume 2, pages 363–368. IEEE.

Vieira, M. A., Ribeiro, E. L. F., Rocha, W. S., Mane, B., Claro, D. B., Oliveira, J. S., and Lima, E. (2017). Enhancing MIDAS towards a transparent interoperability between SaaS and DaaS. In Proceedings of the XIII Brazilian Symposium on Information Systems, pages 356–363.

Xu, J., Shi, M., Chen, C., Zhang, Z., Fu, J., and Liu, C. H. (2016). ZQL: A unified middleware bridging both relational and NoSQL databases. In Dependable, Autonomic and Secure Computing, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech), 2016 IEEE 14th Intl C, pages 730–737. IEEE.
