Design of Scalable and Resilient Applications
using Microservice Architecture in PaaS Cloud
David Gesvindr, Jaroslav Davidek and Barbora Buhnova
Lab of Software Architectures and Information Systems,
Faculty of Informatics, Masaryk University, Brno, Czech Republic
Keywords:
Cloud Computing, Microservices, Architecture Design.
Abstract:
With the increasing adoption of microservice architecture and popularity of Platform as a Service (PaaS)
cloud, software architecture design is in many domains leaning towards composition of loosely interconnected
services hosted in the PaaS cloud, which in comparison to traditional multitier applications introduces new
design challenges that software architects need to face when aiming at high scalability and resilience. In
this paper, we study the key design decisions made during microservice architecture design and deployment
in PaaS cloud. We identify major challenges of microservice architecture design in the context of the PaaS
cloud, and examine the effects of architectural tactics and design patterns in addressing them. We apply
selected tactics on a sample e-commerce application, consisting of microservices operated by Azure Service
Fabric and utilizing other supportive PaaS cloud services within Microsoft Azure. The impact of the examined
design decisions on the throughput, response time and scalability of the analyzed application is evaluated and
discussed.
1 INTRODUCTION
Microservice architecture is becoming a dominant ar-
chitectural style in the service-oriented software in-
dustry (Alshuqayran et al., 2016). In contrast to tra-
ditional multitier applications where the role of soft-
ware components is played mainly by software li-
braries deployed and executed in a single process to-
gether with the main application, in microservice ar-
chitecture, individual components become truly au-
tonomous services (Fowler, 2014). There are multi-
ple advantages of this approach—change in a single
component does not require the entire application to
be redeployed, communication interfaces become ex-
plicit, and components become more decoupled and
independent of each other, as illustrated in Figure 1.
Separation of services into functions that interact via interfaces is not new, and neither are the methods to implement such separation within the framework of service-oriented architecture (Sill, 2016). But as
emphasized by Sill (Sill, 2016), recent implementa-
tions of microservices in cloud settings take service-
oriented architecture to new limits. Possibilities of
rapid scalability and use of rich PaaS (Platform as a
Service) cloud services open new design possibilities
for microservices but also bring new threats for soft-
ware architects, with increasing difficulty to navigate
among the enormous number of available design op-
tions. This creates the need to examine the impact of
applicable design patterns in the context of microser-
vices and PaaS cloud.
Although some guidance on microservice implementation in the cloud exists, systematic support
for software architects interconnecting microservices
with other available PaaS cloud services (such as stor-
age and communication services) is not available,
leaving them to rely on shared experience with typ-
ically a single application scenario, without consider-
ing other strategies or alternative designs.
In this paper, we study different architectural decisions that are considered during microservice architecture design in connection with the PaaS cloud.
Figure 1: Separation of components and their deployments in a traditional multitier architecture and in microservice architecture.
As a contribution of this work, we elaborate on both documented and undocumented design practices and solutions, and study their effects, including the identification of several surprising takeaways. As part of this
paper, we have designed and implemented a highly
configurable e-commerce application (an e-shop solu-
tion), which is designed in a way that its architecture
can be easily reconfigured to support thorough evalua-
tion of the impact of various design decisions on mul-
tiple performance related quality attributes (through-
put, response time and scalability). When designing
this application, we paid special attention to the se-
lection of real use cases and realistic architecture de-
sign, being overall as close as possible to a produc-
tion version of such an application. The highly con-
figurable architecture of this application gives us a unique opportunity to compare multiple versions of the same application using microservice architecture and to evaluate and isolate the impact of different design decisions, which is rarely seen in existing work. Software architects can benefit from our work while designing their own applications when facing the same design decisions. With the help of this work, they should be able to make better-informed decisions and choose the right architectural patterns leading to the desired quality of the application, or avoid undocumented problems caused by the chosen architecture or the PaaS cloud services used.
For the purpose of effective benchmark execution,
we also implemented an automated client application
that can reconfigure the deployed application, initial-
ize sample data seeding on the server based on user
requirements, execute any mix of workload, display
key performance metrics and export detailed perfor-
mance counters in JSON format. This application was
used for our benchmarks discussed in this paper.
For the implementation of the sample application,
we have decided to use Microsoft technologies and
cloud services. The application is developed in .NET
framework using Azure Service Fabric (Mic, 2018),
which is an open-source application platform for sim-
plified management and deployment of microservices
that can run in the Microsoft Azure cloud, on on-premises infrastructure, or in any other cloud infrastructure. This platform was chosen because of its robustness (Microsoft internally uses this technology to operate large-scale services in Microsoft Azure, e.g. Azure SQL Database, Cosmos DB and others) and its rich platform services, which simplify microservice development.
On the other hand, the features that we used can be
manually implemented in other frameworks and the
same results can be obtained by hosting small web ap-
plications communicating with each other via REST
APIs, hosted in Docker and orchestrated by Kuber-
netes. Overall, our results are generally valid independently of the platform, as the patterns used are platform independent and can also be applied with other cloud providers (Amazon, Google Cloud) offering container-hosting services and managed NoSQL databases.
Overall, we have evaluated 105 different benchmark scenarios using 4 cluster configurations (5-, 10-, 15- and 20-node clusters), involving 4 different storage services in the PaaS cloud, 2 communication strategies (synchronous and asynchronous) and 11 design patterns.
The paper is structured as follows. After the discussion of related work in Section 2, an outline of the background in Section 3, and an outline of the key architectural decisions that influenced the separation of services in Section 4, Sections 5, 6 and 7 are dedicated to the presentation and evaluation of architectural decisions concerned with service storage, communication between microservices, and application resilience. We conclude the paper in Section 8.
2 RELATED WORK
When designing microservice architectures, software
architects are currently often relying on known de-
sign patterns and tactics (Gamma et al., 1995; Fowler,
2002), which are however not validated in the con-
text of microservices or PaaS cloud. Aderaldo et
al. point to an absence of repeatable empirical re-
search on the design and development of microser-
vice applications (Aderaldo et al., 2017). New design
guidelines for microservice architectures are emerg-
ing (Sill, 2016; Wolff, 2016; Nadareishvili et al.,
2016; Newman, 2015), which however do not con-
tain evaluated performance impacts of recommended
patterns on realistic implementations. At the same
time catalogs of design patterns for the design of PaaS
cloud applications are becoming available (Erl et al.,
2013; Wilder, 2012; Homer et al., 2014; Mic, 2017),
but without measured impacts of their combinations
and their use in a context of microservices. Vali-
dations of microservice architecture design patterns
are published by companies that have deployed mi-
croservices (Richardson, 2017; Net, 2015) and want
to share their experience with the transition to microservice design, but without addressing PaaS cloud deployment. Instead, they focus on their currently deployed architecture, its behavior, and sometimes related performance characteristics rather than transferable take-
aways. Due to the size of their projects, they cannot
afford to implement multiple variants of their applica-
tion using different design patterns and compare per-
formance of those to isolate the impact of used de-
sign patterns. This is where our work complements
the current state of the art by offering more guidance
for the actual decision making. A case-study eval-
uating the impact of transition from multi-tiered ar-
chitecture to microservice architecture on throughput
and operation costs is described in (Villamizar et al.,
2015), but not in the context of PaaS cloud, as it is de-
ployed to IaaS virtual machines. Challenges related
to transaction processing and data consistency across
multiple microservices are described in (Mihinduku-
lasooriya et al., 2016; Pardon et al., 2018). Criteria
for microservice benchmarks and a list of sample ap-
plications are presented in (Aderaldo et al., 2017), but
without optimization for PaaS cloud and its services.
3 MICROSERVICES IN PaaS
CLOUD
An indisputable advantage of operating microservices
in the PaaS cloud is the availability of a rich set
of complex ready-to-use services, providing software
architects with complex functionality, high service
quality (high scalability and availability guaranteed
by SLA), low-effort deployment, and thus easy in-
tegration within the developed application. Moreover, many of these services are not even available for on-premises deployment, or are costly to deploy and operate with the same quality of service.
Microservices hosted in the PaaS cloud can bene-
fit very well from cloud elasticity and measured ser-
vice (Mell and Grance, 2011), which allows us to eas-
ily scale individual services by allocating new com-
pute resources and pay only for the time when the
service instance is running. As part of low-effort de-
ployment of microservices, we can take advantage of
container orchestration as a service, which is offered by the majority of cloud providers and used to manage and orchestrate applications deployed in the form of containers. Very often, it is a preconfigured and fully managed Kubernetes cluster. For example, in Microsoft Azure, the Azure Container Service is not even billed; one pays only for the compute resources used to host the containers themselves. To support rapid scalability, we do not need to allocate virtual machines with the Kubernetes cluster to host containerized applications. Instead, we can take advantage of PaaS cloud container-hosting services (e.g. Azure Container Instances), which are fully managed services providing per-second billing based on the number of created instances, the memory and cores selected for the instances, and the number of seconds those instances run. Such a rapid elastic-
ity allows us to scale microservices almost instantly
with very effective operation costs.
Microsoft Azure, used for our implementation, also provides the possibility to host an Azure Service Fabric cluster in the form of a fully managed service with very low deployment and maintenance effort. The cluster itself is deployed and operated at no cost; we are only billed for the virtual machines used to host our services. New virtual machines can be easily
provisioned and released based on the overall utiliza-
tion of the cluster, to take advantage of cloud elasticity
and to optimize operation costs.
Despite all mentioned advantages of microservice
deployment to the PaaS cloud, there are associated
threats related to missing guidance on how to design
microservices in the PaaS cloud context. As there is
a very rich set of PaaS cloud services currently avail-
able that can be utilized by the microservice appli-
cation (storage, messaging, etc.) and have a direct
impact on the quality of the service, it becomes very
complex for a software architect to design the mi-
croservice application so that it meets all given quality
criteria. In this paper we compare and discuss mul-
tiple design choices and their impacts learned from
running over 100 experiments with variable software
architecture of our microservice application.
4 SERVICE DECOMPOSITION
DECISIONS
This section describes the key design decisions that
shaped the overall architecture of the designed appli-
cation and led to separation of the application into a
set of interconnected microservices, refined from the
initial set of domain entities, identified using the Do-
main Driven Design principles (Evans, 2003).
4.1 Bounded Context
To split an application with a single data model into
a set of microservices, the Bounded Context design
principle suggests the division of a large data model
into a set of smaller models with explicitly defined
relationships. Proper application of the bounded con-
text principle is one of the major challenges when
designing a microservice architecture, as it becomes
very difficult to find the right balance between very
small microservices having a single data entity, and
services handling multiple entities that tend to ulti-
mately end up being too complex.
The following points characterize the advantages of keeping the bounded context small (Fowler, 2014): explicit service dependencies, independent scalability, and high availability.
On the other hand, the problems that arise with the
utilization of small bounded contexts are:
Data integrity enforcement – Referential integrity
of entities stored at a single microservice can be
easily enforced at the storage level, but when re-
ferring entities are stored in different microser-
vices, it becomes very complex to guarantee that
the referenced entity exists.
Cross-service queries – When the user wants to access data that is distributed among multiple microservices, it is necessary to query all services participating in the query and then combine the related data, which is a complex operation and may have a negative impact on service response time, as shown in Section 5.6.
Cross-service transactions – Distributed transactions are generally complex to implement, and when a transaction modifies data across multiple microservices, it requires the developer to implement additional logic to ensure that the transactions on all services are either all committed or all rolled back.
Data duplication – To overcome issues related to cross-service queries, transactions and integrity enforcement, frequently referenced entities can be stored in multiple copies as part of multiple services, at the price of additional consistency management.
Our sample application consists of 7 microser-
vices (6 stateful, 1 stateless) depicted in Figure 2. We
designed every service to manage and store a single
domain entity. There are two exceptions we would
like to explain here:
Product Service manages the product catalog, which persists Product and Category entities. We considered separating these closely coupled entities into isolated microservices, but because they reference each other very often and, at the same time, Categories are only referenced by Products, we decided to store them in a single microservice.
Sales Service manages stored Orders (headers with embedded items) for a specific user and at the same time stores a list of OrderItems for a specific product. We decided to store sales data in this duplicate manner due to limits imposed by the applied partitioning.
4.2 Partitioning
Partitioning is very often associated with the storage
layer (Homer et al., 2014) but in the context of mi-
croservices, partitioning can be propagated up to the
service interface depending on the storage technology
used. If the microservices are stateful, the use of the
partitioning at the storage level is highly advisable, so
that every node hosting the service only stores a spe-
cific portion of data based on the partition key. Selec-
tion of the partition key must be done very carefully
as the key will be then required by most of the service
methods to be able to determine which instance can
process the request, as depicted in Figure 3. At the
same time, requesting data across multiple partitions
becomes a very complex operation, as illustrated in
Figure 4, which needs to be minimized by design.
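To make the idea concrete, the following C# sketch shows one possible way to derive a numeric partition key from a domain identifier such as ProductID; the helper and its names are illustrative assumptions and not taken from the paper's implementation.

using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper that maps a domain identifier (e.g. ProductID)
// to a signed 64-bit partition key, as used by ranged partitioning schemes.
public static class PartitionKeyResolver
{
    public static long ToPartitionKey(string id)
    {
        // Hash the identifier so that keys spread evenly across partitions.
        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(id));
            return BitConverter.ToInt64(hash, 0);
        }
    }
}

// Usage: all service methods that touch product data receive the ProductID,
// so the caller can always resolve the owning partition first:
// long key = PartitionKeyResolver.ToPartitionKey(productId);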
4.3 Summary of Recommendations
The data model of the application must be split into multiple microservices with an adequate level of granularity. Because of better consistency enforcement and higher communication efficiency, closely coupled entities should be in the same microservice. An important point to emphasize is that microservices should also be separated at the storage level. Storage shared by multiple microservices should be strictly avoided by design, as it hinders independent scalability of microservices. Such storage becomes a single point of failure, with a high risk of becoming a performance bottleneck. To design highly scalable microservices, partitioning at the storage and compute level should be applied.
4.4 Evaluation
The impacts of our partitioning strategy on scalability
of the designed application can be observed in Fig-
ure 9, which shows throughput of the REST API de-
pending on the size of the compute cluster and confirms that the applied partitioning strategy leads to a scalable microservice application.
5 STORAGE DESIGN DECISIONS
Selection of a storage technology or storage ser-
vice in the PaaS cloud has a significant impact
on the throughput and scalability of the applica-
tion (Gesvindr and Buhnova, 2016a). As we expect that, even in the case of microservice architecture, the storage tier of a stateful service will significantly influence the performance metrics of the service, we decided to evaluate four different storage technologies that can be utilized by our application, to assess how they limit service scalability and what the overall throughput of the service in different scenarios will be.
Figure 2: Separation of data model into different microservices, with the partition key used by each service.
This shall provide software architects with additional guid-
ance for selecting storage technology and design of
the storage layer. We evaluate only NoSQL storage
services, as it was demonstrated in (Gesvindr and
Buhnova, 2016a) that in the PaaS cloud they signif-
icantly outperform classical relational databases.
5.1 Storage Abstractions
Individual microservices in our application were implemented without a dependency on any specific storage technology or service, which is still a fairly rare yet advisable approach, and which allowed us to isolate and measure the impact of the storage technology used. We designed our own abstraction layer comprised of storage-independent repositories and adapters for specific storages. Queries used to retrieve data from the storage are defined in C# in the form of our own composite predicates, which are descriptive enough to meet all querying needs of the application. Our storage abstraction layer parses them and generates native queries for the given storage technology. Moreover, to define a repository, we only define a storage-independent domain entity as a class in C#, and all necessary storage structures are automatically created and optimized for a given set of queries, which is most important in the case of the key-value storage with limited querying support. This abstraction layer allows us to easily reconfigure which storage technology is used without the need to redeploy the application, making the benchmarking process more effective.
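As a rough illustration of the kind of abstraction described above (the actual implementation uses its own composite predicates; here we approximate them with LINQ expression trees, and all names are hypothetical):

using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Threading.Tasks;

// Illustrative sketch of a storage-independent repository abstraction.
public interface IRepository<TEntity> where TEntity : class
{
    Task AddAsync(TEntity entity);
    // The predicate is translated by a storage adapter into a native query.
    Task<IReadOnlyList<TEntity>> QueryAsync(Expression<Func<TEntity, bool>> predicate);
}

// One adapter per storage technology implements the same contract,
// so switching storage means swapping the adapter, not the services.
public sealed class InMemoryRepository<TEntity> : IRepository<TEntity> where TEntity : class
{
    private readonly List<TEntity> _items = new List<TEntity>();

    public Task AddAsync(TEntity entity)
    {
        _items.Add(entity);
        return Task.CompletedTask;
    }

    public Task<IReadOnlyList<TEntity>> QueryAsync(Expression<Func<TEntity, bool>> predicate)
    {
        var compiled = predicate.Compile();
        IReadOnlyList<TEntity> result = _items.FindAll(x => compiled(x));
        return Task.FromResult(result);
    }
}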
5.2 Key-value NoSQL Database
The first type of storage we decided to evaluate is a key-value NoSQL database, as it is provided in the form of a PaaS cloud service by every major cloud provider and offers very high throughput, nearly unlimited scalability and low operation costs, at the price of limited querying support, which can be partially mitigated by proper design of the data access tier.
The representative key-value NoSQL database we
evaluated is Azure Table Storage which is a fully
managed key-value NoSQL database with almost un-
limited scalability when data partitioning is properly
used, but it comes at the cost of very limited query
support, which needs to be taken into account by
the architect. Data is stored in tables without any
fixed structure (every row can have a different set of
columns), where every row must be uniquely identi-
fied by a pair of partition key and row key. Data with
the same partition key is stored on the same server,
therefore it is important to generate the partition key
for stored rows in such a way that the rows are dis-
tributed across multiple servers, which leads to high
scalability of this storage. It is important to be aware
of limited querying support as data can be efficiently
filtered only based on partition and row key, not by
other columns, which are not indexed. The fastest
queries are those accessing the data from a single par-
tition with an exact match of a row key. When query is
executed to retrieve data across multiple partitions, its
response time significantly increases. The service is
billed based on the amount of used storage and num-
ber of transactions. There are no performance tiers or
billing based on performance.
To retrieve data from Azure Table Storage effectively, it is necessary to store the same table in duplicate copies, each with a different set of partition and row keys, as illustrated in Table 1 for the list of products in a catalog. Data is partitioned by CategoryID, as we always display the list of products in a single category, and each copy is stored sorted in ascending order by a different row key, which supports filtering and sorting based on that column (e.g. in the table: Name, EAN Code and Price).
Table 1: Tables generated to store duplicate data in Azure Table Storage for effective querying.

Table name          | PartitionKey | RowKey
Product             | CategoryID   | CategoryID ProductID
Product Name        | CategoryID   | Name ProductID
Product Name DSC    | CategoryID   | Name(inverted) ProductID
Product EANCode     | CategoryID   | EANCode
Product EANCode DSC | CategoryID   | EANCode(inverted)
Product Price       | CategoryID   | Price ProductID
Product Price DSC   | CategoryID   | Price(inverted) ProductID
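The inverted row keys in Table 1 can be produced, for example, by subtracting the value from a fixed maximum and zero-padding the result, so that the lexicographic (ascending) order of the keys corresponds to descending order of the original column. A minimal, hypothetical sketch:

using System;

// Hypothetical helper illustrating how an "inverted" row key can be built:
// ascending lexicographic order of these keys equals descending order of
// the original numeric value, which Azure Table Storage cannot sort by itself.
public static class RowKeys
{
    private const long MaxValue = 9_999_999_999;

    public static string Ascending(long value, string productId)
        => $"{value:D10}_{productId}";

    public static string Descending(long value, string productId)
        => $"{MaxValue - value:D10}_{productId}";
}

// Usage (prices stored in cents to keep the key numeric):
// RowKeys.Ascending(1999, "P42")  -> "0000001999_P42"
// RowKeys.Descending(1999, "P42") -> "9999998000_P42"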
We took an experimental approach where for
some queries with unpredictable performance, we
take advantage of low transaction costs and high scal-
ability of Azure Table Storage and execute multiple
variants of the query leading to the same result using
different tables. Only the fastest query returns data.
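The "fastest query wins" approach can be expressed roughly as follows; the query variants passed in stand for the alternative table scans and are hypothetical:

using System.Collections.Generic;
using System.Threading.Tasks;

public static class FastestQuery
{
    // Runs several equivalent query variants in parallel and returns the result
    // of whichever finishes first; the remaining cheap Table Storage queries
    // simply complete in the background.
    public static async Task<IReadOnlyList<T>> RunAsync<T>(
        params Task<IReadOnlyList<T>>[] queryVariants)
    {
        Task<IReadOnlyList<T>> first = await Task.WhenAny(queryVariants);
        return await first;
    }
}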
5.3 Document NoSQL Database
The second type of storage service we want to evalu-
ate is a NoSQL document database, which in compar-
ison to key-value storages provides complex query-
ing support by internal indexing of stored documents,
which decreases complexity of effective implementa-
tion for the developer.
As a representative within MS Azure, we eval-
uated Azure Cosmos DB, which is a multi-model
NoSQL database as a service, built by Microsoft es-
pecially for a highly distributed cloud environment. It
supports multiple data models: key-value storage with a client protocol compatible with Azure Table Storage, document storage accessible through its own protocol or the MongoDB API, column families with support for Apache Cassandra clients, and graph data with support for Gremlin clients.
Microsoft provides an SLA on the performance of this service: Azure Cosmos DB guarantees less than 10 ms latency on reads and less than 15 ms latency on (indexed) writes at the 99th percentile. Throughput is influenced by the number of reserved Request Units (RU), and the service is billed based on the amount of storage and the reserved RU.
Our application implements adapters for both the document storage and the key-value storage, as we wanted to compare their performance within a single service. Although the key-value storage was evaluated using Azure Table Storage, Cosmos DB does not share its limitations related to query support: Azure Cosmos DB automatically indexes all columns in the table, so we do not have to store data in highly duplicated form to be able to query it efficiently. In the document mode, we store collections of complex entities and, same as for the key-value storage, all properties are indexed, so we can efficiently load data based on complex queries.
Results of this experiment are even more interesting when we also compare the pricing models and related costs of the used storage services. Azure Cosmos DB requires us to reserve performance in the form of costly Request Units (RU); every operation costs a certain amount of RU, based on operation complexity known prior to its execution, and when all RU are consumed by queries, new operations are rejected with an error. The client therefore does not utilize the full potential of the service's hardware, but is logically limited so that the performance SLA can be guaranteed. Azure Table Storage, in contrast, charges a very low transaction fee and provides the maximum performance of the service without any need to prepay a certain performance tier, but also without any performance guarantees.
5.4 Reliable In-memory Storage at
Compute Nodes
The last type of storage we wanted to compare in our
sample application is a distributed storage using reli-
able in-memory collections, which are locally present
at the compute nodes of the cluster.
Stateful services in Azure Service Fabric have
their own unique storage called Azure Service Fabric
Reliable Collections, so such a hosted microservice
does not have to use external storage services. The
main goal is to collocate compute resources and the
storage on the same node in the cluster to minimize
communication latency. The storage itself is provided in the form of Reliable Collections, which are an evolution of the classical .NET Framework collections, but provide the developer with persistent storage, multi-node high availability achieved through replication (one active replica, multiple passive replicas), and high scalability achieved via data partitioning.
There are three types of supported collections: Re-
liable Dictionary, Reliable Queue and Reliable Con-
current Queue.
To distribute data and workload across the Service Fabric cluster, it is advisable that every stateful service implements data partitioning and stores data based on a partition key in separate partitions.
Figure 3: Accessing partitioned data stored in Azure Service Fabric stateful services.
When the service is requested to load data, Service Fabric provides a very simple API that, based on the partition key, opens a connection to the service instance that stores the partition and executes the microservice code, which loads data from the local storage of the service partition, as depicted in Figure 3. A problem arises when we need to access data across multiple partitions: the request is sent to a random service partition, which loads local data and requests data from the other service partitions over the network, as depicted in Figure 4. This leads to delays in responses. Efficient partitioning is key to the design of highly scalable and efficient storage using Reliable Collections. The partitioning keys we applied to our data model are depicted in Figure 2.
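A read inside a stateful service partition might then look roughly as follows, following the general shape of the Service Fabric Reliable Collections API; the Product entity and the class names are hypothetical:

using System.Threading.Tasks;
using Microsoft.ServiceFabric.Data;
using Microsoft.ServiceFabric.Data.Collections;

// Hypothetical entity stored in the Product Service partition.
public class Product
{
    public string Id { get; set; }
    public string Name { get; set; }
}

// Sketch of a read inside a stateful service: the partition that owns the
// requested ProductID serves it from its node-local Reliable Dictionary.
public class ProductStore
{
    private readonly IReliableStateManager _stateManager;

    public ProductStore(IReliableStateManager stateManager)
    {
        _stateManager = stateManager;
    }

    public async Task<Product> TryGetProductAsync(string productId)
    {
        var products = await _stateManager
            .GetOrAddAsync<IReliableDictionary<string, Product>>("products");

        using (ITransaction tx = _stateManager.CreateTransaction())
        {
            // The read is served from the local replica of this partition.
            ConditionalValue<Product> result = await products.TryGetValueAsync(tx, productId);
            return result.HasValue ? result.Value : null;
        }
    }
}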
Figure 4: Accessing data stored across multiple partitions of an Azure Service Fabric stateful service.
5.5 Summary of Recommendations
Our experiments confirm a strong dependency of the application's throughput on the storage technology used. None of the evaluated storage technologies outperformed the others in all scenarios; therefore, the selection of a storage technology should be accompanied by benchmarks of the implemented microservices to determine whether the selected technology meets the required performance criteria. Use of in-memory storage collocated with compute resources (Reliable Collections) leads to great scalability and the lowest operation costs. Due to the high communication overhead observable in complex scenarios, we would recommend the use of microservice architecture for scenarios where microservices do not have to communicate frequently with each other.
5.6 Evaluation
We have evaluated the use of the four different stor-
age technologies with our microservice application.
The results of the benchmark evaluating the throughput of microservices hosted on a 5-node cluster (Azure Service Fabric 6.0.232, nodes hosted on Azure Virtual Machines D11 v2: 2 cores, 14 GB RAM, 100 GB SSD cache), depending on the storage technology used, are depicted in Figure 5. We evaluated five different workloads because we expected that the various storage services would have different performance characteristics depending on the type of workload. The most surprising and important result is that there is indeed no single storage service that outperforms all the others in all scenarios. Another observation is that the throughput of complex operations is very low due to the high overhead of cross-service communication, which is one of the disadvantages of microservice architecture.
Figure 5: Throughput of REST API hosted on 5-node cluster with different storage services using synchronous communication for different scenarios.
Figure 6: Response time of REST API hosted on 5-node cluster with different storage services using synchronous communication for different scenarios.
To assess the scalability of the evaluated storage services, we deployed the same application on a 20-node cluster. The results of the benchmarks are depicted in Figure 7. The benchmarks confirm that our service is scalable and that its scalability is significantly influenced by the storage service used, as the throughput increase does not have the same ratio for all storage services and scenarios. We also learned that the use of Azure Cosmos DB is, despite its high throughput, very tricky, because one needs to pay for reserved storage performance (Request Units) and it is challenging to adjust the performance of individual collections to achieve the best service performance without wasting allocated Request Units.
One can see from our experiments that a single
service instance of the stateless Public API Service,
which resends client requests to individual services,
was not overloaded even for the 20-node cluster, which may be a sign of the efficiency of the ASP.NET Core framework used.
Figure 7: Throughput of REST API hosted on 20-node cluster with different storage services using synchronous communication for different scenarios.
Figure 8: Response time of REST API hosted on 20-node cluster with different storage services using synchronous communication for different scenarios.
We were further interested in the performance of
Reliable Collections because they are hosted as an in-
tegral part of Azure Service Fabric without any ad-
ditional costs. From the benchmarks, we can confirm that the performance of Reliable Collections is strongly dependent on the size of the cluster, as the workload is evenly distributed over multiple nodes using partitioning, as depicted in Figure 9. On the 20-node cluster (Figure 7), read operations outperform even the very expensive Azure Cosmos DB (a cost comparison per request is depicted in Figure 10), especially in complex scenarios. Unfortunately, write operations are slower due to the complex data replication among nodes that provides data redundancy.
Figure 9: Throughput of REST API hosted on 5, 10, 15 and 20 node cluster with reliable collections storage using synchronous communication for different scenarios.
Figure 10: Cost per 1 million REST API requests hosted on 20 node cluster with different storage services using synchronous communication for different scenarios.
6 COMMUNICATION STRATEGY
DESIGN DECISIONS
When handling communication between individual
services, there are two major strategies that can be
applied when microservices communicate with each
other:
Synchronous – The service requests a response from another microservice and waits for the response before it continues execution.
Asynchronous – The service sends a request (message) to another service using a messaging service and does not actively wait for the response. When the response is generated, it is delivered through the messaging service back to the original service.
Impacts of asynchronous messaging on multiple qual-
ity attributes of the application are well described
in (Taylor et al., 2009) and specifically for the PaaS
cloud environment in (Homer et al., 2014; Gesvindr
and Buhnova, 2016a). Due to different complexity
and frequency of communication between microser-
vices, we find it desirable to validate the impacts
of both communication strategies on throughput, re-
sponse time and scalability of the application. To this
end we implemented our sample application in such
a way that the communication strategy can be easily
switched thanks to adequate abstractions in its archi-
tecture.
6.1 Synchronous Communication
Strategy
The client application communicates with the REST API service, which forwards requests received via the publicly available REST API to individual microservices running in the Azure Service Fabric cluster. To discover and communicate with services running in the cluster, the following steps need to happen:
Figure 11: Diagram of service interactions.
1. The location of the service needs to be resolved – The service instance can be migrated between different nodes of the cluster and can be running in multiple instances. If the service is stateless, requests can be evenly distributed. If the service is stateful, then each service partition should receive only requests for data that it stores locally.
2. Connection to the service – When the location of the service is resolved by the reverse proxy, which is a service running in the Azure Service Fabric cluster, a direct connection can be opened between the two services hosted in the cluster and the requested operation can be executed. When the connection fails, retry logic implemented as part of the Azure Service Fabric infrastructure is applied.
Interactions between individual microservices and
types of requests are depicted in Figure 11.
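A synchronous, partition-aware call of this kind might be sketched as follows using the Service Fabric remoting client; the IProductService contract, the service URI and the reuse of the PartitionKeyResolver helper from the sketch in Section 4.2 are all illustrative assumptions:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Services.Client;
using Microsoft.ServiceFabric.Services.Remoting;
using Microsoft.ServiceFabric.Services.Remoting.Client;

// Hypothetical remoting contract exposed by the Product Service.
public interface IProductService : IService
{
    Task<IReadOnlyList<string>> GetProductsInCategoryAsync(string categoryId);
}

public static class ProductClient
{
    public static Task<IReadOnlyList<string>> GetProductsAsync(string categoryId)
    {
        // A stable hash of the category (see the partitioning sketch in Section 4.2)
        // maps the request to the partition that owns the data.
        long partitionKey = PartitionKeyResolver.ToPartitionKey(categoryId);

        // Open a direct connection to the owning partition; Service Fabric
        // handles endpoint resolution and transparent retries.
        IProductService proxy = ServiceProxy.Create<IProductService>(
            new Uri("fabric:/Eshop/ProductService"),   // hypothetical service URI
            new ServicePartitionKey(partitionKey));

        return proxy.GetProductsInCategoryAsync(categoryId);
    }
}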
6.2 Asynchronous Communication
Strategy
An alternative communication strategy does not open
direct connections between microservices hosted in
the cluster, but the service sends a request (message)
to another service using a messaging service. When the
message is delivered, the request is processed and the
response is sent back to the messaging service and
is delivered to the service waiting for the response.
The advantage of this approach is a looser coupling
of the microservices as they do not rely on a defined
communication interface but only on the format of re-
quest and response messages. The disadvantage is higher implementation complexity and thus higher implementation costs.
A messaging service is not part of Azure Service Fabric; therefore, we use the ServiceFabric.PubSubActors library, which hosts a new microservice that works as a reliable messaging stateful service, internally using a Reliable Queue as the storage for messages. Other services can then subscribe to receive messages of a specified type, which are unfortunately broadcast across all partitions of the subscribing service if it has more than one.
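Independently of the concrete library (whose API we do not reproduce here), the shape of such request/response messaging can be illustrated with hypothetical message contracts carrying a correlation identifier:

using System;

// Hypothetical message contracts for asynchronous request/response between
// microservices: the correlation id lets the caller match a response that
// arrives later through the messaging service to its original request.
public sealed class PriceComputationRequest
{
    public Guid CorrelationId { get; set; } = Guid.NewGuid();
    public string ProductId { get; set; }
    public int Quantity { get; set; }
}

public sealed class PriceComputationResponse
{
    public Guid CorrelationId { get; set; }
    public decimal TotalPrice { get; set; }
}

// The caller publishes a PriceComputationRequest to the broker service and
// subscribes to PriceComputationResponse messages; it completes the pending
// operation whose CorrelationId matches, instead of blocking on a direct call.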
6.3 Summary of Recommendations
Based on the results of the experiments, it is advisable
to use synchronous communication as a primary com-
munication pattern due to significantly lower over-
head, higher throughput and faster response time.
Asynchronous communication is desirable for long
running operations or operations that need to be com-
pleted reliably (e.g. corrective actions described in
Section 7.1).
6.4 Evaluation
We implemented both communication strategies in our sample application, which provided us with an opportunity to evaluate and compare their behavior on the same application, which is rarely seen in experience reports from industry, as having both implementations is costly and not suitable for large production applications.
Figure 12: Throughput of REST API hosted on 5-node cluster using synchronous and asynchronous communication with reliable collections storage for different scenarios.
Figure 13: Response time of REST API hosted on 5-node cluster using synchronous and asynchronous communication with reliable collections storage for different scenarios.
The results of the benchmark are depicted in Figure 13. Despite the fact that synchronous calls are considered harmful (Fowler, 2014), it is very surprising how significantly synchronous service communication outperforms its asynchronous alternative, which uses messaging as a form of reliable communication between services. The use of reliable messaging services leads to increased availability, as very short outages of individual services are not propagated to the client; however, based on our implementation and tests, this communication strategy has, for simple scenarios, four times lower throughput than synchronous calls. In the case of complex scenarios, the difference is significantly smaller, from which one can conclude that asynchronous messaging is worth considering for long-lasting complex operations where high throughput is not required and reliability is more important. Similar conclusions for PaaS cloud applications (not in the context of microservices) are also mentioned in (Gesvindr and Buhnova, 2016a).
7 RESILIENCE DESIGN
DECISIONS
When designing a microservice architecture, one of
the biggest challenges is to enforce data consistency
across multiple services, and to manage cross-service
transactions. Software architects are nowadays provided with hardly any guidance on addressing these design challenges in microservice architecture; we therefore came up with our own implementation, which is a modified version of the compensating transaction pattern (Homer et al., 2014) combined with an event sourcing pattern (Homer et al., 2014) to handle reliable cross-service compensations. Our implementation does not increase the complexity of successfully executed operations, i.e., no performance impacts were found during benchmarks.
7.1 Cross-service Transactions
To deal with the mentioned issues, we implemented
a component called Event Sequence Source. This
component is used to describe a complex transaction
across multiple services, but the transaction itself is
split into multiple atomic blocks: an internal block executes operations inside the service, while an external block wraps communication with another service. As depicted in Figure 14, each atomic block contains the action to be executed and also a paired corrective action to revert its changes if the transaction fails in any of the blocks. This sequence of operations is executed exactly in the order of its definition. We discovered that data vali-
dation should be executed as the initial action because
when this fails, no corrective actions are needed. As
no data integrity is enforced by the storage across ser-
vices, we run all necessary validations as part of the
transaction. When any of the operations fails, an ex-
ecution of corrective actions is initiated. Corrective
actions are implemented as always-succeed actions,
which means that they are stored in a highly reliable
queue to ensure that the action is repeatedly executed
until it succeeds. With this approach, the corrective
action overcomes even temporary service failures.
Our implementation of cross-service transactions does not ensure as high a level of consistency and atomicity as known from relational databases, but it provides us with a structured transaction description and sufficient guarantees for the purpose of our business transactions, with minimal performance impact.
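A minimal sketch of the described idea (a sequence of blocks, each pairing an action with a corrective action that is retried via a reliable queue) could look as follows; the class and the reliable-queue callback are illustrative, not the actual implementation:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Illustrative sketch of an "event sequence source": actions run in definition
// order; on failure, the corrective actions of already completed blocks are
// enqueued into a reliable queue so they are retried until they succeed.
public sealed class EventSequenceSource
{
    private readonly List<(Func<Task> Action, Func<Task> Correction)> _blocks
        = new List<(Func<Task> Action, Func<Task> Correction)>();
    private readonly Func<Func<Task>, Task> _enqueueReliably;

    public EventSequenceSource(Func<Func<Task>, Task> enqueueReliably)
        => _enqueueReliably = enqueueReliably;   // e.g. backed by a Reliable Queue

    public EventSequenceSource Add(Func<Task> action, Func<Task> correction = null)
    {
        _blocks.Add((action, correction));
        return this;
    }

    public async Task<bool> ExecuteAsync()
    {
        var completed = new Stack<Func<Task>>();
        foreach (var (action, correction) in _blocks)
        {
            try
            {
                await action();
                if (correction != null) completed.Push(correction);
            }
            catch
            {
                // Roll back already completed blocks via always-succeed actions.
                while (completed.Count > 0)
                    await _enqueueReliably(completed.Pop());
                return false;
            }
        }
        return true;
    }
}

An order confirmation would then be composed as a chain such as Add(ValidateUser).Add(PersistOrder, DeleteOrder).Add(DecreaseStockLevel, IncreaseStockLevel), mirroring the workflow in Figure 14.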
7.2 Constraints Enforcement
Since the data is stored across multiple microservices
with isolated storages, it is not possible to enforce ref-
erential integrity at the storage level. Therefore, this needs to be enforced within transactions. Every transaction, as described in the previous section, runs its own data validation, during which it validates the existence of referenced entities in other services.
Another issue we addressed was how to generate unique identifiers of stored entities, because in many current applications the identifier of a record (primary key) is generated by a relational database, which cannot be followed in our project, and not all storage services offer support for generating a unique identifier of a record. Therefore, we generate a globally unique identifier in our service code when a new entity is created, before it is persisted. The uniqueness of the value is then enforced by the storage.
Data validation operations are implemented as a part of the service code, mostly in the constructor of an entity, to prevent an invalid entity from being instantiated.
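As an illustration, an entity might generate its identifier and validate itself in its constructor; the ProductReview class below is hypothetical:

using System;

// Illustrative entity: the identifier is generated in service code (not by the
// storage) and basic validation happens in the constructor, so an invalid
// instance can never be created and later persisted.
public sealed class ProductReview
{
    public Guid Id { get; }
    public Guid ProductId { get; }
    public string Text { get; }

    public ProductReview(Guid productId, string text)
    {
        if (productId == Guid.Empty)
            throw new ArgumentException("ProductId must be set.", nameof(productId));
        if (string.IsNullOrWhiteSpace(text))
            throw new ArgumentException("Review text must not be empty.", nameof(text));

        Id = Guid.NewGuid();   // globally unique identifier generated by the service
        ProductId = productId;
        Text = text;
    }
}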
Figure 14: Transaction workflow using Event Sequence Source.
7.3 Handling Transient Errors
It is very important in the PaaS cloud to properly han-
dle transient faults (Gesvindr and Buhnova, 2016b) by
implementing a retry strategy so that when a cloud re-
source is not currently available, an error is not propa-
gated to the client, but instead the operation is retried
multiple times with an increasing delay. This is already implemented in the majority of client libraries and just needs to be enabled.
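Where a client library does not provide retries out of the box, a simple retry helper with increasing delays could be sketched as follows (the attempt count and delays are arbitrary choices):

using System;
using System.Threading.Tasks;

public static class TransientRetry
{
    // Retries an operation a few times with an increasing (exponential) delay,
    // so short outages of a cloud resource are not propagated to the client.
    public static async Task<T> ExecuteAsync<T>(Func<Task<T>> operation, int maxAttempts = 4)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // 1st retry after 200 ms, then 400 ms, 800 ms, ...
                await Task.Delay(TimeSpan.FromMilliseconds(200 * Math.Pow(2, attempt - 1)));
            }
        }
    }
}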
7.4 Recoverability
To increase recoverability of the application in the
PaaS cloud, it is advisable to implement the Circuit
Breaker Pattern (Homer et al., 2014). This applies also to microservice architecture, as the pattern prevents an application from repeatedly trying to execute an operation that is likely to fail, thus avoiding wasted resources. It detects whether the fault has been resolved and then gradually increases the load as more and more requests are permitted to execute.
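A deliberately simplified, single-threaded sketch of the pattern's core idea follows; a production implementation would additionally ramp the load up gradually and handle concurrent callers:

using System;
using System.Threading.Tasks;

// Minimal circuit breaker sketch: after a number of consecutive failures the
// circuit "opens" and calls fail fast; after a cool-down period one trial call
// is let through, and on success the circuit closes again.
public sealed class CircuitBreaker
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _openDuration;
    private int _consecutiveFailures;
    private DateTime _openedAtUtc;

    public CircuitBreaker(int failureThreshold, TimeSpan openDuration)
    {
        _failureThreshold = failureThreshold;
        _openDuration = openDuration;
    }

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> operation)
    {
        bool open = _consecutiveFailures >= _failureThreshold;
        if (open && DateTime.UtcNow - _openedAtUtc < _openDuration)
            throw new InvalidOperationException("Circuit is open; failing fast.");

        try
        {
            T result = await operation();   // half-open: a trial call after the cool-down
            _consecutiveFailures = 0;       // success closes the circuit
            return result;
        }
        catch
        {
            if (++_consecutiveFailures >= _failureThreshold)
                _openedAtUtc = DateTime.UtcNow;   // open (or re-open) the circuit
            throw;
        }
    }
}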
7.5 Summary of Recommendations
Cloud computing services frequently experience very short outages lasting a few seconds; therefore, adequate transient error handling policies in the form of a retry strategy need to be implemented. Especially for microservices, these outages could lead to a costly rollback of cross-service transactions. Validity of data must be enforced mostly at the application level, as, due to the use of isolated storage services, constraint enforcement cannot be applied at the storage level.
7.6 Evaluation
Based on our observations, none of the presented resilience design decisions has a measurable impact on application performance for successful requests, as no additional actions need to be executed. Thus, the evaluation of these strategies with respect to the performance metrics studied in this paper is not relevant. However, it might be interesting to study their effects on resilience-motivated quality attributes, which is out of the scope of this paper.
8 CONCLUSION
In this paper, we have identified, discussed and eval-
uated a set of design principles that influence service
decomposition, storage, communication strategy and
resilience in microservice architecture deployed in
PaaS cloud. On the sample application, we measured
their impact and presented numerous findings, which
support the observation that microservice architec-
ture leads to high scalability, but brings new design
challenges further amplified by operation in the PaaS
cloud and richness of design choices that the archi-
tects have. Decomposition of the services needs to be
carefully validated, selection of the storage provider
cannot be done without knowledge of a specific work-
load and benchmarks, synchronous communication
strategy was found to perform way better despite rec-
ommendations in literature. Additional effort shall be
invested in extension of the studied design principles
and patterns for microservice design in the context of
the PaaS cloud.
ACKNOWLEDGEMENT
This research was supported by ERDF "CyberSecurity, CyberCrime and Critical Information Infrastructures Center of Excellence" (No. CZ.02.1.01/0.0/0.0/16_019/0000822).
REFERENCES
(2015). Why you can’t talk about microservices without mentioning Netflix. https://smartbear.com/blog/develop/why-you-cant-talk-about-microservices-without-ment/.
(2017). Cloud design patterns. https://docs.microsoft.com/en-us/azure/architecture/patterns/.
(2018). Microsoft Service Fabric. https://github.com/microsoft/service-fabric.
Aderaldo, C. M., Mendonc¸a, N. C., Pahl, C., and Jamshidi,
P. (2017). Benchmark requirements for microservices
architecture research. In Proceedings of the 1st Inter-
national Workshop on Establishing the Community-
Wide Infrastructure for Architecture-Based Software
Engineering, ECASE ’17, pages 8–13, Piscataway,
NJ, USA. IEEE Press.
Alshuqayran, N., Ali, N., and Evans, R. (2016). A system-
atic mapping study in microservice architecture. In
2016 IEEE 9th International Conference on Service-
Oriented Computing and Applications (SOCA), pages
44–51.
Erl, T., Puttini, R., and Mahmood, Z. (2013). Cloud Com-
puting: Concepts, Technology & Architecture. Pren-
tice Hall Press, Upper Saddle River, NJ, USA, 1st edi-
tion.
Evans, E. (2003). Domain-Driven Design: Tackling Com-
plexity in the Heart of Software. Addison-Wesley.
Fowler, M. (2002). Patterns of Enterprise Application Ar-
chitecture. Addison-Wesley Longman Publishing Co.,
Inc., Boston, MA, USA.
Fowler, M. (2014). Microservices: a definition of this new architectural term.
Gamma, E., Helm, R., Johnson, R., and Vlissides, J.
(1995). Design Patterns: Elements of Reusable
Object-oriented Software. Addison-Wesley Longman
Publishing Co., Inc., Boston, MA, USA.
Gesvindr, D. and Buhnova, B. (2016a). Architectural tac-
tics for the design of efficient PaaS cloud applications.
In 2016 13th Working IEEE/IFIP Conference on Soft-
ware Architecture (WICSA).
Gesvindr, D. and Buhnova, B. (2016b). Performance chal-
lenges, current bad practices, and hints in PaaS cloud
application design. SIGMETRICS Perform. Eval. Rev.,
43(4).
Homer, A., Sharp, J., Brader, L., Narumoto, M., and Swan-
son, T. (2014). Cloud Design Patterns: Prescriptive
Architecture Guidance for Cloud Applications. Mi-
crosoft patterns & practices.
Mell, P. and Grance, T. (2011). The NIST definition of
cloud computing.
Mihindukulasooriya, N., García-Castro, R., Esteban-Gutiérrez, M., and Gómez-Pérez, A. (2016). A survey of RESTful transaction models: One model does not fit all. J. Web Eng., 15(1-2):130–169.
Nadareishvili, I., Mitra, R., McLarty, M., and Amundsen,
M. (2016). Microservice Architecture: aligning prin-
ciples, practices and culture. O’Reilly Media, Inc.,
1st edition.
Newman, S. (2015). Building Microservices. O’Reilly Me-
dia, Inc., 1st edition.
Pardon, G., Pautasso, C., and Zimmermann, O. (2018).
Consistent disaster recovery for microservices: the
BAC theorem. IEEE Cloud Computing, 5(1):49–59.
Richardson, C. (2017). Who is using microservices?
Sill, A. (2016). The design and architecture of microser-
vices. IEEE Cloud Computing, 3(5):76–80.
Taylor, R. N., Medvidovic, N., and Dashofy, E. M. (2009).
Software architecture: foundations, theory, and prac-
tice. Wiley Publishing.
Villamizar, M., Garcés, O., Castro, H., Verano, M., Sala-
manca, L., Casallas, R., and Gil, S. (2015). Evaluating
the monolithic and the microservice architecture pat-
tern to deploy web applications in the cloud. In 2015
10th Computing Colombian Conference (10CCC),
pages 583–590.
Wilder, B. (2012). Cloud Architecture Patterns. O’Reilly
Media, 1st edition.
Wolff, E. (2016). Microservices: Flexible Software Archi-
tectures. CreateSpace Independent Publishing Plat-
form.