Bootstrapping and Plug-and-Play Operations on Software Deﬁned

Networks: A Case Study on Self-conﬁguration using the SONAr

Architecture

Maur

ıcio Amaral Gonc¸alves

, Natal Vieira de Souza Neto

, Daniel Ricardo Cunha Oliveira

avio de Oliveira Silva

and Pedro Frosi Rosa

Faculty of Computing, Federal University of Uberl

andia, Uberl

andia, Brazil

Keywords:

Self-management, Self-driving, Intent-based, Autonomic Computing, SON, SDN, NFV, Future Internet, 5G.

Abstract:

The autonomous network concept has gained strength with the growing complexity of the current networks,

especially after the deﬁnition of 5G requirements and their key performance indicators. This concept is chal-

lenging to implement in legacy networks, and it has become feasible with the emergence of network soft-

warization, which enables the deployment of functionalities through a logically centralized control plane ab-

straction. The network softwarization simpliﬁes the management process, reduces operational costs (OPEX),

enhances protection against failures, and enables complex requirements such as high-performance indicators

and IoT support. The Self-Organizing Networks Architecture (SONAr) project uses cutting-edge technologies

and concepts such as SDN, NFV, and Machine Learning. It proposes a new architecture aimed at the design

of self-management in computer networks, oriented by declarative intents. In this work, we introduce the

SONAr project by describing its components and speciﬁcations. We also present a case study that shows the

self-conﬁguration property that includes bootstrapping and plug-and-play operations using SONAr compo-

nents focusing on strategies applicable to OpenFlow based networks. We explain the decisions made in the

implementation and present comparative results between them.

1 INTRODUCTION

Infrastructure for future computer networks will have

great use of Software Deﬁned Networking (SDN) and

Network Function Virtualization (NFV) (Ramirez-

Perez and Ramos, 2016), aiming to satisfy require-

ments from modern applications. The use cases

which should take the beneﬁts of these future net-

works are 5G, Internet of Things (IoT), health appli-

cations, low latency communications, massive video

streaming, and so on (Barona L

opez et al., 2017).

SDN and NFV bring many advantages to computer

and telecommunications networks, mainly related to

scalability, fast deployment of new protocols, and

programmable services (Cox et al., 2017). Unfortu-

nately, these approaches also bring some new man-

agement issues to the network.

https://orcid.org/0000-0002-6985-638X

https://orcid.org/0000-0001-5047-4106

https://orcid.org/0000-0003-4767-5518

https://orcid.org/0000-0001-7051-7396

https://orcid.org/0000-0001-8820-9113

In traditional networks (before SDN and NFV),

the conﬁguration was generally applied to infrastruc-

ture elements manually with low-level abstraction.

This strategy required highly complex planning to

provide communication services with many conﬁgu-

ration details per network element. At the same time,

the SDN and NFV conﬁguration bring new challenges

related to bootstrapping and control plane availability.

In mobile networks, operated by telecommunica-

tions companies, the concept of Self-Organizing Net-

works (SON), speciﬁed by 3GPP standards (3GPP,

2018b)(3GPP, 2018a)(3GPP, 2018a)(3GPP, 2018b)

deﬁnes the use of SON for network automation. The

idea of autonomic computing was also used in soft-

ware engineering and deﬁned by IBM (Ganek and

Corbi, 2003a).

However, SON functionalities used by mobile net-

works focuses only on the Radio Access Network

(RAN). The core of the network has neither standards

nor complete architectures to handle such self-* prop-

erties. The self-* refers to a set of properties such

as self-healing, self-conﬁguration, self-optimization,

self-protection, and so on (Boutaba and Aib, 2007).

Gonçalves, M., Neto, N., Oliveira, D., Silva, F. and Rosa, P.

Bootstrapping and Plug-and-Play Operations on Software Deﬁned Networks: A Case Study on Self-conﬁguration using the SONAr Architecture.

DOI: 10.5220/0009406901030114

In Proceedings of the 10th International Conference on Cloud Computing and Services Science (CLOSER 2020), pages 103-114

ISBN: 978-989-758-424-4

103

The ﬁrst step in developing an architecture for net-

work self-management is to understand the require-

ments of administrators and operators. This under-

standing is useful both for the design of the solu-

tion and for gaining the trust of administrators, who

fear the loss of control and dependence on tools in

the administrative process (Duez et al., 2006). In-

appropriate automation can increase the complexity

of management, what is known as “irony of automa-

tion” (Bainbridge, 1983), or even propagate errors on

a larger scale than with manual intervention. On the

other hand, it will be impossible to manually man-

age the complexity of future networks. Because of

this, autonomic computing should be explored to self-

manage the environments (Sanchez et al., 2014).

In this work, we introduce the Self-Organizing

Networks Architecture (SONAr), presenting its com-

plete speciﬁcation. SONAr’s design principle is

to have a vendor-independent solution, i.e., a self-

management product that can be deployed in any en-

vironment. Data centers, home networks, telecom-

munications core networks, enterprise networks, and

others can take advantage of using SONAr to apply

self-management in their infrastructure. SONAr aims

to use self-* properties to handle and manage infras-

tructure, hardware and software, and network com-

munication in general.

We also present a case study that explains the

self-conﬁguration property that includes bootstrap-

ping and plug-and-play operations. This experimen-

tal evaluation of the self-conﬁguration SONAr com-

ponents focuses on strategies applicable to OpenFlow

based networks. We explain the decisions made in

the implementation and present comparative results

between them.

The paper is structured as follows. Section 2

presents the background and related work. Section

3 shows the requirements and challenges of man-

agement in future networks. Section 4 presents the

SONAr solution with an overview of its main com-

ponents. Section 5 shows the results from our exper-

iments regarding self-conﬁguration, and ﬁnally, Sec-

tion 6 presents concluding remarks and future work.

2 BACKGROUND

The concept of Self-Organizing Networks(3GPP,

2018) has been explored by telecommunication net-

works standards (3GPP, 2018b)(3GPP, 2018a)(3GPP,

2018b)(3GPP, 2018a) as a way of dealing with the

growing complexity of the infrastructure and the need

for coexistence with legacy technologies (Feng and

Seidel, 2008). In the same way, autonomous com-

puting concept (Ganek and Corbi, 2003b) proposes

the automation of administrative and operational pro-

cesses as a way to reduce the complexity of com-

puting systems, minimize operational costs (OPEX)

and improve Quality of Service (QoS). The term is an

analogy to the autonomous nervous system, which is

the mechanism responsible for important bodily func-

tions (e.g., breathing and blood supply) without con-

scious intervention. Similarly, we believe that net-

works must be able to self-manage with minimal hu-

man intervention.

In this section we show how SDN (McKeown

et al., 2008) and NFV (Cui et al., 2012) could be

used together to apply SON and autonomic comput-

ing in the core network of future networks. At the end,

we present some related works, which have explored

some points, and correlate them with SONAr.

2.1 SDN and NFV

Since SDN has programmable and centralized con-

trol, the conﬁguration operations would be simpliﬁed.

The SDN provides a high-level abstraction which al-

lows the services deployment as a combination of

ﬂows and functions.

NFV is usually presented together with SDN as

a way to make provisioning for network functions

more ﬂexible. By considering this approach, the

SDN controller is usually deployed as a virtual func-

tion. Besides that, other Virtual Network Func-

tions (VNF), such as routers, gateways, ﬁrewalls etc,

need to work without interruptions. It means that

the SDN and NFV need to guarantee that the VNFs

are movable in the available infrastructure, keeping

routes and other network conﬁgurations (Cox et al.,

2017)(Barona L

opez et al., 2017).

Some tools and frameworks for management are

available in SDN. OF-Conﬁg is a protocol based on

NETCONF, designed for SDN management. The

protocol is restricted to OpenFlow. AdVisor is an-

other example, which provides a management inter-

face. These tools help the operation and implementa-

tion of management requirements, such as bootstrap,

isolation, planning, monitoring etc (Wickboldt et al.,

2015).

In future networks, the management keeps some

challenges such as provide transparent mobility, low

latency communication, minimal cost, and support-

ing millions of new devices.These problems need to

be solved or mitigated using advanced techniques,

like autonomic computing and self-management

(Barona L

opez et al., 2017).

CLOSER 2020 - 10th International Conference on Cloud Computing and Services Science

104

2.2 Related Work

SELFNET project (Neves et al., 2016) presents an

architecture for orchestration, administration and ac-

cess control in SDN and NFV environments. It in-

tegrates SDN, SON, NFV, cloud computing and Ar-

tiﬁcial Intelligence (AI) to create an autonomic net-

work management. Some sensors and actuators are

deployed in the control plane and data plane, with

the mission to execute the autonomic computing ac-

tions (monitoring, optimizing, recovery etc). SELF-

NET is a promising work in progress, designed for

5G networks. SONAr and SELFNET have some mu-

tual functions and could work together in the future,

since SONAr is designed to be independent of the net-

work type (5G, core network, home networks etc can

use the beneﬁts of the solution).

In (d. R. Fonseca and Mota, 2017) the authors ex-

plore fault management in SDN. This paper shows

the beneﬁts provided by SDN, and describes the new

failure points which were created with the SDN ap-

proach. It is important to know the different possible

failures: in the data plane, in the controllers, or in the

SDN applications.

In (Abdallah et al., 2018) the authors present

a framework for FCAPS (Fault, Conﬁguration, Ac-

counting, Performance, Security) model. The paper

shows the functions that can be assigned by the con-

troller, and those that should be assigned by the Man-

agement Plane. It also shows a brief description of the

use of OF-Conﬁg, instead of OpenFlow, for manage-

ment needs.

As described in this section, many groups around

the world have applied self-management through

SDN and NFV. There is no project completed in this

scenario (i.e. self-management in network core). Ad-

ministrators and network operators have many re-

quirements not met by the proposed frameworks.

3 SELF-MANAGEMENT

REQUIREMENTS FOR FUTURE

NETWORKS

This section introduces the main application require-

ments into future network environments, as well as

the requirements of network operators, that is, re-

quirements related to network management.

3.1 Future Internet Applications and

QoS Requirements

The Future Internet is being developed as a response

to the strict requirements and complexity of new ap-

plications – some of which are already being explored

in a limited scope. Some good examples are VR/AR

applications, drone control (together with 4K video

transmission), tactile Internet, in-cloud gaming, au-

tonomous driving, among others. All of these cases

have a certain degree of high bandwidth and through-

put as a common requirement, and low latency re-

sponse is crucial to their successful implementation,

especially when considering real-time interaction. In

these particular cases, the senses of participating hu-

mans must be taken into account. For example, audi-

tory interactions require a response time of 100 ms,

while the visual reaction requires about 10 ms and

tactile feedback, round trip delay time less than 1 ms

(Simsek et al., 2016).

For a VR/AR total immersive experience indistin-

guishable from the reality, a network throughput of

5.2 Gb/s would be required for each user, albeit in a

more realistic approach for practical applications, a

throughput of 100-200 Mb/s with a latency of 13 ms

give most people a good immersive experience with-

out a feeling of motion sickness (Bastug et al., 2017).

Another important issue is the network reliability.

Some 5G typical applications demands less than 10

−7

failure rate, which corresponds to 3.16 seconds of un-

availability per year (Simsek et al., 2016).

In general, the main requirements for a new gen-

eration network – e.g. high ﬂexibility, plasticity, eas-

iness of (re)conﬁguration, implementation and con-

trol, reliability and cost-effectiveness – are not reach-

able using the legacy or current working network pro-

tocols.

3.2 Main Requirements of Real

Operators and Telecommunications

Industry

Implementing a self-managed SDN network is not

an easy task. From the network operator’s point of

view, several challenges remain, if aspects such as

backward compatibility, choice of deployment strat-

egy, security issues, and especially business cases are

closely observed (Zaidi et al., 2018). However, when

considering the beneﬁts of a Self-Organizing Network

in the Future Internet scenario and the limitations al-

ready discussed, it is very clear that a SDN approach

combined with SON architecture is a smart and viable

solution. The main requirements presented by net-

Bootstrapping and Plug-and-Play Operations on Software Deﬁned Networks: A Case Study on Self-conﬁguration using the SONAr

Architecture

105

work operators are: self-conﬁguration, self-healing,

self-optimization and self-protecting. Without these

requisites, it will be difﬁcult - if not impossible - to

successfully deploy the Future Internet applications

in a commercial network:

• Self-conﬁguration: Setting up network elements

usually means a lot of work, done by skilled en-

gineers with a deep understanding of the oper-

ator’s network, every time a new Network Ele-

ment (NE) is put into service or moved from one

place to another. It is not difﬁcult to predict the

amount of work that will be needed only to con-

ﬁgure the radio access network – including front,

mid and backhaul – of a complex and high den-

sity 5G macro/small/pico cells network of a city

such as Beijing or S

ao Paulo. The Future Internet

will require plug-and-play solutions not only for

the deployment of a complex system, but also to

make its plasticity feasible;

• Self-healing: Ultra-high reliability will be a key

issue in the Future Networks. Nowadays, the fault

recovery management and recovery rely on the

traditional distributed protocols – such as EIGRP

or OSPF – to handle failures, which can not pro-

vide the required 99.999% reliability and may

lead to some important issues (d. R. Fonseca

and Mota, 2017). A quick response to faults

is an indispensable aspect to be observed in a

new network architecture, and self-healing is an

important feature to be considered in any Self-

Organizing Networks architecture proposal and

an indisputable requirement for the telecommuni-

cations industry;

• Self-optimization: self-optimization process is a

response to the increasing complexity of today’s

networks. As they become bigger, interconnected,

multi-vendor and, thus, more and more complex,

the work to achieve efﬁciency also becomes more

difﬁcult bordering on the impractical. Until a

few years ago, self-optimization tools were re-

stricted – and even full-blown – to mobile radio

access networks (RAN). The concept was ﬁrst

introduced in 3GPP Rel. 8 (Moysen and Giup-

poni, 2018), as an answer to the huge amount

of network parameters that had to be managed

by the RAN optimization engineers. With SON

being introduced in the access and transport net-

works, it is a natural path that self-optimization

process should be also part of the evolution, as it

will increase the system efﬁciency, thus reducing

CAPEX/OPEX on nodes and links;

• Self-protection: Intrinsic self-protecting systems

are also an important feature in any SON archi-

tecture. The main types of threats are: DDoS at-

tacks, user to root (U2R) attacks, remote to local

(R2L) attacks and probe attacks, and a number of

router-based defense mechanisms have been pro-

posed to detect and prevent them (Hariri et al.,

2005). Needless to afﬁrm that an inherent, fast-

converging and intelligent self-protection system

will be indispensable for future networks man-

agement and a great demand from the operators

point-of-view.

The background and the requirements for future

networks already presented give the basement for

SONAr. Section 4 brings the new architecture design,

the main components speciﬁcation (and their integra-

tion), and the layer placement of the solution.

4 SONAr DESIGN AND

SPECIFICATION

Figure 1 presents the SONAr design, its components

and their distribution by network layers. The main

modules are described in this section.

4.1 Self-Organizing Entities

The Self-Organizing Entities (SOEs) are responsible

to perform the basic algorithms related with the au-

tonomic computing fundamentals. The intelligence

to heal, optimize, conﬁgure, protect, and plan is per-

formed by these entities. Usually, the Network Event

Manager (NEM) receives events from all components

of the architecture. The Self-Organizing Entities sub-

scribe to speciﬁc topics on NEM, and then receive the

events related with self-* properties.

Self-Healing Entity (SHE) performs algorithms to

predict or detect faults in all of the components de-

picted in Figure 1. In case of failures, it is necessary to

recover from them. Note that faults in the Infrastruc-

ture Layer within SDN elements (such as OpenFlow

switches) are usually handled by the SDN controller.

Unfortunately, some legacy devices are not covered

by controllers, and need special attention. In addi-

tion to the infrastructure, SHE needs to detect fail-

ures in the Control Layer, such as bugs or crashes

in controllers, Virtual Infrastructure Manager (VIM),

VNF managers etc. SONAr components may also

fail, and SHE must have recovery functions. Further-

more, SHE itself may fail and need to recover itself as

soon as possible.

The Self-Conﬁguration Entity (SCE) works at the

time of the network bootstrap and the NE plugging. In

bootstrapping, the SCE conﬁgures all ﬂows within the

CLOSER 2020 - 10th International Conference on Cloud Computing and Services Science

106

Figure 1: SONAr Design.

NE to move the network from an out-of-service state

to a running state. When the network is running, the

SCE works at the moment a new NE is plugged in the

environment. In current infrastructures, an operator

physically connects the new NE and conﬁgures the

NE and its neighbors, e.g., the address (usually IP),

host name, and routes are manually conﬁgured. The

purpose of the SCE is to be responsible for discover-

ing these settings and executing them automatically.

SCE is also responsible for the safety of new com-

ponents. In traditional architectures, when a new

equipment is connected to the network by an operator,

it is discovered by the topology and begins to func-

tion. Routing protocols, such as OSPF, start creating

routes for this new equipment. Security may become

a problem in this scenario if this new equipment is not

valid or authorized. With SONAr, the SCE needs to

guarantee the identity of new components.

The Self-Optimization Entity (SOPE) subscribes

statistical topics and analyzes possible network im-

provements. Considering a SDN and NFV environ-

ment, the improvements are usually related to real-

location of resources. In the Infrastructure Layer, if

the SDN is an OpenFlow implementation, SOPE can

rearrange the ﬂow rules within the OpenFlow switch

to use the resources intelligently. Network slice man-

agement is also covered by SOPE. Note that SOPE

collaborates with the Diagnostics Self-Learning En-

tity (DSLE). Almost all improvements are made after

the analysis of the history. Differentiation between

DSLE and SOPE functions is important: DSLE per-

forms AI algorithms while SOPE receives DSLE out-

comes and performs algorithms to improve the use of

network resources and infrastructure.

A common example of improvement is to avoid

congestion: if the DSLE identiﬁes that an NE or a port

will be congested in the future, it will notify SOPE to

perform an optimization procedure. SOPE will then

analyze the event sent by DSLE and rebalance ﬂows

between routes.

The Self-Protection Entity (SPE) ensures the pro-

tection of all layers in Figure 1 against attacks. SPE

has the services prepared for isolation and recovery.

The ﬁrst step is to isolate the compromised compo-

Bootstrapping and Plug-and-Play Operations on Software Deﬁned Networks: A Case Study on Self-conﬁguration using the SONAr

Architecture

107

nents, preventing attacks from reaching other compo-

nents. After isolation, the SPE will recover the at-

tacked component, clearing the remains of the attack,

and then reinserting the clean component into the in-

frastructure. Like SOPE, SPE is also based on diag-

noses made by DSLE.

Architectures and frameworks in academy and in-

dustry have created mechanisms to protect the Infras-

tructure Layer. The current challenge is to ensure

the protection of all layers shown in Figure 1. The

Control Layer has components that can be attacked,

such as the SDN controller: once it has control of any

network topology, a successful attack on this compo-

nent gives the attacker control of all infrastructure. In

addition, the SPE needs to isolate attacks on the In-

frastructure Layer, that is, attacks on an NE must be

restricted to that NE and can not reach the Control

Layer. Attacks on the Self-Management Layer and

Administration Layer follow the same analogy.

The Self-Planning Entity (SPLE) works in con-

junction with all Self-Organizing Entities and aims to

schedule actions of other entities. The main idea is to

realize changes that will be made in the future and to

schedule them to be made in a timely manner. An-

other basic function is to ensure that repetitive actions

of SHE, SCE, SPE and SOPE are applied.

SHE, SCE, SPE, SOPE and SPLE are separate en-

tities that perform services that can interact. These

interactions can eventually cause inconsistencies. For

instance the optimization: SOPE could determine an

improvement in the network and to make this im-

provement, a route would need to change from one

port to another in a given NE. It may be that the SPE

blocked this target port in advance to prevent an at-

tack. SOPE’s decision may eventually replace the

previous decision of the SPE. The Self-Orchestration

Entity (SORE) was designed to control the actions

created by other Self-Organizing Entities.

The SORE will receive all the actions of other en-

tities and will organize them in a queue, prioritizing

the actions. The operator could previously determine

the priorities in the SONAr dashboard. To avoid in-

consistencies, SORE works with the concept of trans-

action: if a ﬂow rule in an NE is initiated, other ﬂow

rules in the same register are queued. After the ﬁrst

ﬂow rule is completed, the SORE will execute the oth-

ers. Rollbacks are also performed by the SORE.

4.2 Self-Learning Entities

The Self-Learning Entities (SLEs) are responsible

for providing AI based mechanisms to overcome the

absence of human intervention. These entities use

knowledge bases that can be built dynamically in an

action/consequence analysis, manually in a guided

learning strategy or imported from third parties simi-

larly to antivirus bases (which allows the easy set-up

for new networks). SLE implements machine learn-

ing algorithms to provide scenario prediction, param-

eter adjustment, problem/opportunity diagnosis, be-

havioral pattern detection, infrastructure classiﬁcation

and translation of intentions. These activities are usu-

ally complex and slow, and therefore need to be ad-

dressed by speciﬁc mechanisms separated from Self-

Organizing Entities.

The Prediction Self-Learning Entity (PSLE) ap-

plies the K-Nearest Neighbors (KNN) algorithm to

classify metrics, services, time and topology to pro-

vide a discrete representation of the scenario.

The Tuning Self-Learning Entity (TSLE) provides

hyper-parameter optimization for all SONAr compo-

nents, including those composing SLE. For this, it

monitors the quality of the scenarios, evaluation met-

rics and service levels through a classiﬁcation algo-

rithm such as the KNN.

The Diagnostic Self-Learning Entity (DSLE) im-

plements a correlation algorithm for root cause anal-

ysis, service level assurance, security issues detection

and network degradation diagnostics. This is done

by analyzing the metrics and thresholds, verifying the

occurrence of alarms and monitoring log patterns and

events. The knowledge base for DSLE can be con-

structed with administrator-driven learning or deep-

learning based on data patterns of previous correct di-

agnostics. This entity creates events in the NEM with

the speciﬁc diagnosis and all data available.

The Rating Self-Learning Entity (RSLE) evalu-

ates devices, links, services, components and algo-

rithms based on their performance, costs, usage statis-

tics, current occupation, capabilities, and service level

history for each of the supported policies. With a ma-

ture knowledge base, RSLE can be used to deﬁne the

best infrastructure elements to be used at a speciﬁc

time to conﬁgure a service according to its policies.

The NLP Self-Learning Entity (NSLE) imple-

ments the Intent Translator through Natural Language

Processing (NLP) and Neural Networks. NSLE can

infer policies, functions and ﬁlters from a service

based on Declarative Intents expressed in natural lan-

guage. The knowledge base for translating intentions

can be constructed from guided learning or service

description analysis (which is basically an intention

deﬁned in a formally-described service). NSLE also

implements the feedback delivery aspect of the ser-

vice validator, which is done by analyzing metrics

such as RSLE and DSLE.

The Behavior Pattern Self-Learning Entity

(BSLE) is responsible for detecting network usage

CLOSER 2020 - 10th International Conference on Cloud Computing and Services Science

108

patterns and identifying the types of content being

transmitted. This is an important feature for self-

management since it makes the network sensitive to

changes in context and allows activation of services

under speciﬁc conditions.

4.3 Collecting Entities

The Collecting Entities (CoEs) are responsible for

collecting data from the infrastructure and all other

components of the architecture. This data is useful

for monitoring the state of the network and providing

the guarantee of the level of service.

These entities are also responsible for trigger-

ing events when: new data becomes available, data

changes, limits are reached and reportable patterns

are detected. Basic events can be used by SLEs and

Self-Organizing Entities, which can use raw data as

well as analyze and make decisions. The implemen-

tation of each CoE varies according to the speciﬁcity

of the infrastructure, which can occur through pro-

prietary protocols such as OVSDB and CDP, spe-

ciﬁc management systems, control components such

as the MANO and SDN controller, CLI robots using

ssh/telnet, or open protocols such as SNMP, LLDP,

and syslog.

Each Collecting Entity implements a speciﬁc al-

gorithm to collect and process the related data type.

The Metrics Collecting Entity (MCoE) collects met-

rics from devices, links and servers through speciﬁc

protocols, monitor thresholds and trigger events. The

Topology Collecting Entity (TCoE) applies discovery

algorithms, monitor changes on topology and trigger

speciﬁc events when it occurs. The Alarms Collect-

ing Entity (ACoE) collects events with an active or

reactive strategy, for example by using SNMP traps

or data polling, and triggers more speciﬁc events such

as those deﬁned by the DSLE. The Samples Collect-

ing Entity (SCoE) uses strategies like Deep Packet

Inspection (DPI), port mirroring and network sniff-

ing to infer content types from a particular instance

of communication, which can be used to enable spe-

ciﬁc services and to detect usage anomalies or secu-

rity issues. The Results Collection Entity (RCoE) is

responsible for conducting periodic examinations of

resources and components in order to collect speciﬁc

data that cannot be obtained otherwise, such as la-

tency and availability. Finally, the Logs Collecting

Entity (LCoE) collects logs and looks for predeﬁned

patterns (speciﬁed by administrators or imported from

an existing knowledge base) and ﬁres events when

they match.

CoE plays a vital role for self-management and

can be used in any solution. At the same time, it is

possible to use current collector applications, such as

those implemented in SDN controllers, with a polling

and processing strategy to power the SONAr compo-

nents.

4.4 Network Event Manager

The Network Event Manager (NEM) is a pub-

lisher/subscriber provider, able to handle events cat-

egorized into topics. Event processing can occur syn-

chronously or asynchronously, the second being pro-

vided by a message queue solution. All SONAr com-

ponents trigger events, even for notiﬁcation when per-

forming actions. Typically, SLEs and SORE are sub-

scribed into topics, but SONAr deﬁnes a ﬂexible ar-

chitecture where any Self-Organizing Entities can en-

roll on a speciﬁc topic directly and take advantage of

it. NEM also manages the event relationships gener-

ated by DSLE correlation to avoid unnecessary pro-

cessing and optimize management.

SONAr is a completely event-oriented architec-

ture, so NEM has an important role in connecting all

the other components. The asynchronicity provided

by NEM is essential for the self-management of the

network, considering the Key Performance Indicators

(KPIs) of future networks.

4.5 Network Database

The Network Database (NDB) stores all SONAr in-

formation: metrics, topology data, alarms, samples

and logs collected by CoE; administrative data used

by the dashboard; management primitives intercepted

by CPI; events sent to the NEM; events created by any

entity (SHE, SPE, SOPE, SCE etc); and prediction

and analysis information (including input parameters

and results).

Events sent to the NEM are ﬁrst used by the Self-

Organizing Entities, and stored in the NDB because

they will be used by the SLE Machine Learning al-

gorithms. From these events, the SLE can analyze

information and adjust models to predict network be-

havior. The SLE analyzes data and creates new data

(results), which are also stored in the NDB. The his-

tory of events received and created by the architecture

entities are also maintained in the NDB. The SONAr

dashboard can query this history and operators have

graphical view of the behavior of the network over

time.

The Self-Organizing Entities perform actions in

the topology, and these actions are also stored in

NDB. They can be used in the future to determine if

a speciﬁc action solves a problem or not. SLE could

also take advantage of this information. An example

Bootstrapping and Plug-and-Play Operations on Software Deﬁned Networks: A Case Study on Self-conﬁguration using the SONAr

Architecture

109

is SHE: sometimes more than one action may be re-

quired to recover from a failure. The SHE performs

the action (through the SORE) and, if the action does

not solve the problem, the next action is chosen. The

results of these actions are stored in the NDB, and

in the future the SORE queries the NDB and decides

which action, in this speciﬁc situation, has the highest

success rate.

The NDB is designed to be distributed to ensure

high availability and scalability. Implementation de-

tails are beyond the scope of this work, but it is rec-

ommended that NDB be implemented using well-

known NoSQL databases such as Cassandra, Mon-

goDB, ElasticSearch, etc. (for high availability pur-

poses).

4.6 Controller Interceptor

The Control Primitives Interceptor (CPI) is a crucial

component of SONAr. It is intended to be placed in

the Southbound Interface (SBI), working as a proxy

between the data plane and the SDN controller. The

control primitives that cross this interface are captured

and analyzed by the interceptor. If a primitive has

management semantics, it is sent to the SONAr com-

ponents and SDN controller. Otherwise, it is sent only

to the SDN controller.

The control primitives that travel in SBI are usu-

ally related to applications and hosts connected in the

topology, but some of them are related to primitives

of metrics and notiﬁcations. First, the CPI receives

a primitive and determines if this primitive has man-

agement semantics. After that, it parses the primitive

by performing a shaping and discard analysis. The

shaping will create a new primitive used by SONAr

components. The discard will drop malicious prim-

itives that were sent to the SDN controller by an at-

tacker. The new shaped primitive will be sent to Net-

work Event Manager (NEM), as an event. Then, other

SONAr components will catch the event and treat it.

4.7 Other Elements

The Network Administration Dashboard (NAD) is

an administrative tool for interacting with Self-

Management Layer, allowing: to visualize topology,

metrics, alarms, logs and events; to manage services;

to enable guided learning and manual diagnostics; to

provide service level and network status monitoring;

and to allow the deﬁnition of parameters and goals.

Auto-Boot Manager is a solution for starting the

network. It is responsible for: allocating resources;

instantiating VNFs; deployment of applications; con-

ﬁguring devices and servers; and establishment of

control and management ﬂows. It follows an imple-

mentation plan that lists infrastructure details and a

speciﬁcation of required components and where they

should be deployed, or can be fully automated by a

strategy of self-discovery and self-conﬁguration.

Even following a deployment plan, the Auto-

Boot Manager constantly monitors the infrastruc-

ture to detect new devices and changes in general,

and allows the Self-Organizing Entities to apply set-

tings as needed. It combines a DHCP server, a

FTP server containing conﬁguration scripts, an Auto-

Attach Manager as deﬁned by the IETF for SPBM,

and an OVSDB Manager. It controls the assignment

of identiﬁcation (usually IP) for devices; queries basic

information (using the CoE code); applies the SPBM

strategy to establish control and management ﬂows;

pushes conﬁgurations for devices and servers; and de-

ploys and initiates component control and manage-

ment. The Auto-Boot Manager is also directly re-

sponsible for auto-attaching of devices, servers, and

links in a plug-and-play strategy. To do this, it con-

stantly changes the basic conﬁguration of ﬂows and

deployment (in active mode) or allows the reconﬁgu-

ration called by the SCE (in reactive mode).

The Network Service Broker (NSB) is used by

customers and administrators to conﬁgure communi-

cation services based on intentions inferred from the

behavior/content or described in natural/formal lan-

guage. These services are conﬁgured by SCE and

guaranteed by other SOEs.

5 EXPERIMENTS

A prototype was developed to evaluate the SONAr

architecture, implementing its main components to

test its ability to initialize a software-deﬁned net-

work. This prototype involves: the Auto-Boot Man-

ager, which is responsible for starting and conﬁgur-

ing all infrastructure components; the Topology Col-

lecting Entity, which is responsible for running the

discovery process and for triggering events; the Self-

Conﬁguration Entity, responsible for calculating the

routes, enabling OpenFlow and conﬁguring the con-

troller in the devices, and pushing the ﬂow conﬁgura-

tion through the controller; the Network Event Man-

ager, which is responsible for managing the events ex-

changed by SCE and TCoE; the Network Database,

responsible for storing the data, and the SDN con-

troller used to control the devices.

To facilitate the deployment of the components,

they were implemented as microservices running on

Docker containers. Integration between components

is done by event and database sharing. We also im-

CLOSER 2020 - 10th International Conference on Cloud Computing and Services Science

110

plemented our own Container Infrastructure Manager

(SONAR-CIM), which provides a RESTful API for

managing containers with Docker support through the

Spotify Docker Client API 8.16.1. In addition, we

have implemented a custom DHCP server to pro-

vide dynamic IP address assignment . Our DHCP

server supports the standard method (listening for

broadcast UDP datagrams on port 67) and the Open-

Flow method (processing PACKET-IN and returning

PACKET-OUT).

All components were implemented using JAVA 8

and Spring Boot 2.1.2. We have also used RabbitMQ

3.8.0 for implementing the NEM, Apache Cassandra

3.11.4 for NDB, and ONOS 2.1.0 for SDN controller.

We have built a Docker image of NE using Linux

Alpine 3.9, net-snmp 5.7.3, lldpd 1.0.3 (with SNMP

support) and Open vSwitch 2.10.1. The boot script of

our NE starts the snmpd, lldpd and ovs-vswitchd; cre-

ates the OVSDB database and listens to remote and

local connections; creates a bridge interface and runs

the DHCP client. The SONAr server is a VirtualBox

6.0 VM with Ubuntu Server 16.04, 8194MB of RAM,

22GB of HD and a processor with 8 cores operating at

2.00GHz. In SONAr Server we installed JDK 1.8.0-

151 and also net-snmp 5.7.3 and lldpd 1.0.3 (with

SNMP support). The experiments were simulated us-

ing GNS3 2.1.16 (Neumann, 2014) in a machine with

Ubuntu 18.10, 16 GB of RAM, 40 GB of HD and a

processor with 8 cores operating at 2.00GHz.

The experiments were based on comparative tests

with topologies of different proportions. To make this

possible, we created topologies with 1, 2, 4, 8, 16, 32,

64, and 128 switches in GNS3 to simulate the scenar-

ios described in the next sections.

5.1 Network Bootstrapping Scenario

When SONAr is started, it enters in the Discovery

Stage (initiated by TCoE) in which it tries to dis-

cover all connected devices using BFS search and

multi-threaded processing. In this process, NEs

have already been started and operate as conventional

switches (with learning switch mechanism). When all

NEs are discovered with SNMP without failures or a

timeout is reached, TCoE notiﬁes the change of state

through NEM.

Upon receiving the event that indicates the end

of the Discovery Stage, the SCE starts the Routing

Stage, when the conﬁguration is calculated in a multi-

objective analysis. In this experiment, we simpliﬁed

this process using the Dijkstra algorithm to calculate

the shortest paths between the server and the NEs. In

this step, we also calculate a dependency tree, where

the subsequent node depends on the conﬁguration of

the previous node. This is required to maintain con-

nectivity to devices throughout the process. Conﬁgu-

ration is also calculated at this stage, including ﬂows

to process ARP, DHCP, and routes between switches.

Finally, after the routing stage, the SCE starts the

last stage, called the Conﬁguration Stage. In this

step, the SCE connects the grouped NEs according to

the previously calculated dependencies using a multi-

threading strategy, and conﬁgures the bridge, proto-

cols, and controllers. The SCE waits for the effective

connection of the network element before continuing.

When the NE is connected to the Controller (option-

ally using a CPI), the SCE sends the ﬂow conﬁgura-

tion to the Controller as a conﬁguration block. Upon

all devices have been conﬁgured, this stage ends and

SONAr changes the state of the network to “started”.

5.2 Device Plug-and-Play Scenario

When a new NE is connected to the network, such

as an OpenFlow switch, it usually initiates a self-

conﬁguration process in which an address is requested

through the DHCP protocol (Device Boot stage).

SONAr implements a DHCP server capable of pro-

viding unique addresses and notifying management

components such as TCoE, which starts the initial dis-

covery process to ﬁnd the attachment point of the new

element (Discovery Stage - pt.1). After that, an ac-

cess route can be calculated and deployed in order to

provide connectivity between the SONAr components

and the new device (Route Calculation and Deploy-

ment Stages). Once the access routes are deployed,

the TCoE becomes able to perform the discovery rou-

tine with the plugged device (Discovery Stage - pt.2)

and, in the end, notify the other components about the

new available device.

Upon receiving the event that indicates a new

available device, SCE starts the conﬁguration process

(Conﬁguration Stage) in which the device is conﬁg-

ured to use a speciﬁc controller (or CPI) and a default

conﬁguration is deployed according to the adminis-

trative goals deﬁned as policies.

5.3 Results

Experiments were performed comparing the SONAr

performance in topologies with different proportions.

We tested the time spent by SONAr to initialize and

extend networks with 1, 2, 4, 8, 16, 32, 64 and 128

devices. In each sub-experiment at least eight tests

were performed (one for each network), and results

with errors or abnormal delays were discarded.

In each test the conﬁguration of the switches was

reset and the components were removed from the

Bootstrapping and Plug-and-Play Operations on Software Deﬁned Networks: A Case Study on Self-conﬁguration using the SONAr

Architecture

111

server in order to test boot on networks from scratch.

As can be seen in Figure 3, we tested three dif-

ferent network bootstrapping strategies: boot 1 – con-

necting devices directly to controllers (without CPI)

and conﬁguring devices starting from the servers to

the most distant nodes (from roots to leaves); boot2

– connecting devices via CPI to controllers and con-

ﬁguring devices starting from servers to the most dis-

tant nodes (from roots to leaves); and, boot 3 – con-

necting devices via CPI to controllers and conﬁgur-

ing devices starting from the most distant nodes to the

servers (from leaves to roots). The division of this ex-

periment in three aims to identify the overhead result-

ing from the use of CPI and the differences between

the conﬁguration orders, which is due to the fact that

when conﬁguring a particular switch it is necessary to

wait for the deployment of all ﬂows before to proceed

conﬁguring the next accessible switches. This occurs

because when conﬁguring the OpenFlow protocol on

the switch bridge the default learning-switch behav-

ior is disabled preventing neighboring devices from

being accessed without speciﬁc ﬂows being created.

By starting from the “leaves” all switches can be ac-

cessed by default learning-switch behavior.

Figure 2 demonstrates that the total bootstrapping

time in all experiments increases linearly according to

the number of devices. It also shows that the overhead

introduced for using the CPI component is minimal

and does not change the linear behavior of bootstrap-

ping, which can be observed when comparing the

times between boot 2 and boot 3 strategies. The re-

sults also show that when comparing boot 1 and boot

2 strategies in networks with more than 4 devices po-

sitioned at a maximum distance of 3 hops, the order

of conﬁguration from leaves to roots presents a reduc-

tion of between 20 and 40 % of total time.

Figure 2: Time Comparison between the Three Bootstrap-

ping Strategies Tested.

Figure 3 presents the time spent in each stage of

bootstrapping using the strategy boot 3. As can be

observed, the time of all stages increases linearly ac-

cording to the number of devices. The routing time

is almost insigniﬁcant compared to the time of other

stages, and this is certainly related to the simpliﬁca-

tion of the multi-objective analysis as a shortest-path

problem and by the absence of any integration with

the external system. The Figure 3 also shows that the

growth of the discovery time through the scenarios

tends to be lighter, and this is explained due to the

multi-threaded breadth-ﬁrst search strategy and the

topology design of the scenario. Every time a device

is discovered, more devices become eligible for dis-

covery, and this causes an acceleration of the process.

The growth of conﬁguration time is more uniform and

tends to increase linearly at almost the same rate as

the number of devices. The Overhead time is basi-

cally the time difference between process startup and

termination excluding Discovery, Routing, and Con-

ﬁguration times.

Figure 3: Time Spent at Each Stage of Bootstrapping Strat-

egy 3: With Interceptor and Conﬁguring Devices from

Leaves to Root.

As can be seen in Figure 4, we tested three dif-

ferent network plug-and-play strategies: pnp 1 – con-

necting devices directly to controllers (without CPI);

pnp 2 – connecting devices via CPI to controllers; and

pnp 3 - connecting devices via CPI to controllers and

using the “delayed discovery” approach. The strategy

pnp 3 is faster because it avoids waiting for the device

to became available to discovery. This is because the

new switch needs to perform routines with the ARP

and LLDP protocols to be able to communicate and

discover neighboring switches. Performing the dis-

covery stage right after conﬁguring access ﬂows can

cause ﬁrst attempts to fail until ARP/LLDP runs. In

order to reduce this waiting time, a strategy called

“delayed discovery ” was proposed in which the new

switch is only discovered after being conﬁgured. This

strategy only works with CPI, because we need to

know the interconnection point of the new switch

without performing a discovery procedure. Figure 4

also shows that using a CPI can improve the plug-and-

CLOSER 2020 - 10th International Conference on Cloud Computing and Services Science

112

Figure 4: Time Comparison between the Three Plug-and-

Play Strategies Tested.

play speed due to the ease of knowing the attachment

point by simply checking the PACKET IN primitive

from OpenFlow protocol which is intercepted by CPI.

Figure 5 shows the time spent at each stage of the

plug-and-play with the pnp 3 strategy. Six measure-

ments were made: Switch Boot (time spent executing

the switch initialization script: starts with initializa-

tion of the container and ends with the assignment

of an address via DHCP); Waiting (indicates time

spent communicating between components, reaction

time to events, and synchronization: comprises the

difference between the total plug-and-play time and

the sum of times of the other steps); Channel Routing

(time spent calculating the best path to access the new

switch); Channel Conﬁguration (time taken to apply

the new ﬂow rules to the topology); Discovery (time

taken to discover and validate the new switch); and,

Conﬁguration (time taken to conﬁgure the new switch

via OVSDB and deploy streams via SDN Controller).

As shown in the Figure 5, the time used with the pnp 3

strategy in all topologies tested showed low variation,

what indicates that the time for plug-and-play devices

with SONAr using this strategy tends to be constant.

The experiments presented in this section show

that SONAr provides a solution capable of automating

Figure 5: Time Spent at Each Stage of Plug-and-Play Strat-

egy 3: Using CPI and “delayed Discovery”.

the bootstrapping and plug-and-play of a self-deﬁned

network and demonstrated the viability and validity

of the proposal of this work. The results also show

the general behavior of the time spent at each stage,

which is useful in network planning and also proves

the polynomial complexity of the solution.

6 CONCLUSION

This paper presented a conceptual solution to en-

able self-organizing networks focusing on self-

conﬁguration property operations. Initially, SONAr

was presented with details about its components,

layers and design. Then, a case study with self-

conﬁguration was conducted to verify the viability

and applicability of SONAr solution to address some

fundamental aspects of self-management: bootstrap-

ping and plug-and-play. The results were satisfac-

tory and demonstrated that SONAr can be applied to

reduce operating costs related to self-conﬁguration.

We also conducted a review of the requirements of

telecommunications companies and their administra-

tors/network operators. These requirements allowed

the speciﬁcation of SONAr presented in this paper.

Despite the gains in terms of OPEX, since SONAr

will automatically perform the conﬁgurations of the

network, the ﬂexibility and the speed – with which

the infrastructure will respond to incidents – is essen-

tial to highlight the improvement in network security,

since the equipment no longer being accessed by hu-

mans for the purpose of (re)conﬁgurations.

SONAr solution is actually completely speciﬁed

in our research group and, for sake of space, a sum-

mary was presented in this document. It is impor-

tant to note that this project is being developed from

experiences with a local telecommunication operator.

Most components showed in Figure 1 are already ﬁn-

ished and some components are being developed at

the present moment. The experiments presented in

this paper show the booting and conﬁguration sce-

narios, which are the ﬁrst steps to pass the computer

network from the non-operational to the operational

state. The experiments related to healing, optimizing,

protection and others are not presented here due to

space limitations.

The future works are to: (i) ﬁnalize the develop-

ment of all SONAr components, using programming

languages and frameworks widely used in industry;

(ii) list many use cases to evaluate all aspects of a

real telecommunications company; and (iii) compare

SONAr with other projects, integrating it where pos-

sible.

Bootstrapping and Plug-and-Play Operations on Software Deﬁned Networks: A Case Study on Self-conﬁguration using the SONAr

Architecture

113

ACKNOWLEDGEMENTS

This study was ﬁnanced in part by the Coordenac¸

de Aperfeic¸oamento de Pessoal de N

ıvel Superior –

Brasil (CAPES) – Finance Code 001. This research

also received the support from PROPP/UFU.

REFERENCES

3GPP (2018a). Telecommunication management; Self-

conﬁguration of network elements; Concepts and re-

quirements. Technical Speciﬁcation (TS) 32.501, 3rd

Generation Partnership Project (3GPP).

3GPP (2018b). Telecommunication management; Self-

Organizing Networks (SON); Concepts and require-

ments. Technical Speciﬁcation (TS) 32.500, 3rd Gen-

eration Partnership Project (3GPP).

3GPP (2018). Telecommunication management; Self-

Organizing Networks (SON); Concepts and require-

ments. Technical Speciﬁcation (TS) 32.500, 3rd Gen-

eration Partnership Project (3GPP).

3GPP (2018a). Telecommunication management; Self-

Organizing Networks (SON) Policy Network Re-

source Model (NRM) Integration Reference Point

(IRP); Requirements. Technical Speciﬁcation (TS)

32.521, 3rd Generation Partnership Project (3GPP).

3GPP (2018b). Telecommunication management; Self-

Organizing Networks (SON); Self-healing concepts

and requirements. Technical Speciﬁcation (TS)

32.541, 3rd Generation Partnership Project (3GPP).

Abdallah, S., Elhajj, I. H., Chehab, A., and Kayssi, A.

(2018). A network management framework for sdn. In

2018 9th IFIP International Conference on New Tech-

nologies, Mobility and Security (NTMS), pages 1–4.

Bainbridge, L. (1983). Ironies of automation. Automatica,

19(6):775–779.

Barona L

opez, L., Valdivieso Caraguay,

A., Sotelo Monge,

M., and Garc

ıa Villalba, L. (2017). Key technolo-

gies in the context of future networks: operational and

management requirements. Future Internet, 9(1):1.

Bastug, E., Bennis, M., Medard, M., and Debbah, M.

(2017). Toward Interconnected Virtual Reality: Op-

portunities, Challenges, and Enablers. IEEE Commu-

nications Magazine, 55(6):110–117.

Boutaba, R. and Aib, I. (2007). Policy-based Management:

A Historical Perspective. Journal of Network and Sys-

tems Management, 15(4):447–480.

Cox, J. H., Chung, J., Donovan, S., Ivey, J., Clark, R. J., Ri-

ley, G., and Owen, H. L. (2017). Advancing software-

deﬁned networks: A survey. IEEE Access, 5:25487–

25526.

Cui, C., Deng, H., Telekom, D., Michel, U., Damker, H.,

Italia, T., Guardini, I., Demaria, E., Minerva, R., and

Manzalini, A. (2012). Network Functions Virtualisa-

tion. SDN and OpenFlow World Congress.

d. R. Fonseca, P. C. and Mota, E. S. (2017). A sur-

vey on fault management in software-deﬁned net-

works. IEEE Communications Surveys Tutorials,

19(4):2284–2321.

Duez, P. P., Zuliani, M. J., and Jamieson, G. A. (2006).

Trust by Design: Information Requirements for Ap-

propriate Trust in Automation. In Proceedings of the

2006 Conference of the Center for Advanced Studies

on Collaborative Research, CASCON ’06, Riverton,

NJ, USA. IBM Corp.

Feng, S. and Seidel, E. (2008). Self-organizing networks

(SON) in 3gpp long term evolution. Nomor Research

GmbH, Munich, Germany, pages 1–15.

Ganek, A. G. and Corbi, T. A. (2003a). The dawning of

the autonomic computing era. IBM systems Journal,

42(1):5–18.

Ganek, A. G. and Corbi, T. A. (2003b). The dawning of

the autonomic computing era. IBM systems Journal,

42(1):5–18.

Hariri, S., Guangzhi Qu, Modukuri, R., Huoping Chen,

and Yousif, M. (2005). Quality-of-protection (QoP)-

an online monitoring and self-protection mechanism.

IEEE Journal on Selected Areas in Communications,

23(10):1983–1993.

McKeown, N., Anderson, T., Balakrishnan, H., Parulkar,

G., Peterson, L., Rexford, J., Shenker, S., and Turner,

J. (2008). OpenFlow: Enabling Innovation in Cam-

pus Networks. SIGCOMM Comput. Commun. Rev.,

38(2):69–74.

Moysen, J. and Giupponi, L. (2018). From 4g to 5g: Self-

organized network management meets machine learn-

ing. Computer Communications, 129:248–268.

Neumann, J. C. (2014). The Book of GNS3. No Starch

Press, San Francisco, CA, USA, 1st edition.

Neves, P., Cal

e, R., Costa, M. R., Parada, C., Parreira,

B., Alcaraz-Calero, J., Wang, Q., Nightingale, J.,

Chirivella-Perez, E., Jiang, W., et al. (2016). The self-

net approach for autonomic management in an nfv/sdn

networking paradigm. International Journal of Dis-

tributed Sensor Networks, 12(2):2897479.

Ramirez-Perez, C. and Ramos, V. (2016). Sdn meets sdr

in self-organizing networks: ﬁtting the pieces of net-

work management. IEEE Communications Magazine,

54(1):48–57.

Sanchez, J., Yahia, I. G. B., Crespi, N., Rasheed, T., and Sir-

acusa, D. (2014). Softwarized 5g networks resiliency

with self-healing. In 1st International Conference on

5G for Ubiquitous Connectivity, pages 229–233.

Simsek, M., Aijaz, A., Dohler, M., Sachs, J., and Fettweis,

G. (2016). 5g-Enabled Tactile Internet. IEEE Jour-

nal on Selected Areas in Communications, 34(3):460–

473.

Wickboldt, J. A., Jesus, W. P. D., Isolani, P. H., Both,

C. B., Rochol, J., and Granville, L. Z. (2015).

Software-deﬁned networking: management require-

ments and challenges. IEEE Communications Mag-

azine, 53(1):278–285.

Zaidi, Z., Friderikos, V., Yousaf, Z., Fletcher, S., Dohler,

M., and Aghvami, H. (2018). Will SDN Be Part

of 5g? IEEE Communications Surveys & Tutorials,

20(4):3220–3258.

CLOSER 2020 - 10th International Conference on Cloud Computing and Services Science

114