Automatic Refactoring of Component-based Software by Detecting and Eliminating Bad Smells
A Search-based Approach

Salim Kebir (1,3), Isabelle Borne and Djamel Meslati

(1) Ecole Nationale Supérieure d’Informatique, BP 68 M Oued-Smar, Algiers, Algeria
(2) IRISA, Université de Bretagne-Sud, Vannes, France
(3) Laboratoire d’Ingénierie des Systèmes Complexes (LISCO), Université Badji Mokhtar, Annaba, Algeria

Keywords: Automatic Refactoring, Search-based Software Engineering, Component-based Software Engineering, Genetic Algorithm, Bad Smells.

Abstract: Refactoring has been proposed as a de facto behavior-preserving means to eliminate bad smells. However, manually determining and performing useful refactorings is a tough challenge because seemingly useful refactorings can improve one aspect of a software system while making another aspect worse. Therefore, it has been proposed to view automated refactoring of object-oriented software as a search problem. Nevertheless, a review of the literature shows that automated refactoring of component-based software has not been investigated yet. Recently a catalogue of component-relevant bad smells has been proposed in the literature, but there is a lack of component-relevant refactorings. In this paper we propose detection rules for component-relevant bad smells as well as a catalogue of component-relevant refactorings. Then we rely on these two elements to propose a search-based approach for automated refactoring of component-based software systems by detecting and eliminating bad smells. Finally, we evaluate our approach on a medium-sized component-based software system and assess its efficiency and accuracy.

1 INTRODUCTION

Due to organizational and market pressures, it is rarely feasible to develop software while constantly keeping in mind that it should be easy to maintain or change to fulfill new requirements, as this forces programmers to focus on an extra time-consuming task. This translates into the emergence of bad smells (Fowler et al., 1999), also called design defects or code anomalies. As a consequence, software becomes hard and too costly to maintain. In order to overcome this problem in object-oriented software systems, refactoring has been proposed to provide behavior-preserving means to eliminate bad smells and improve the design of a software system (Fowler et al., 1999). However, manually determining and performing useful refactorings is a tough challenge (Seng et al., 2006). In order to address it, it has been proposed to view automated refactoring of object-oriented software as a search problem where an automated system can discover useful refactorings (O’Keeffe and Cinnéide, 2006). This can be achieved by searching for a sequence of useful refactorings that improve the overall quality of the system.

A review of the literature shows that automated refactoring of component-based software has not been investigated yet. Recently a catalogue of component-relevant bad smells has been proposed by Garcia et al. (Garcia et al., 2009) and extended by Macia et al. (Macia et al., 2013), but there is a lack of component-relevant refactoring operations to overcome these bad smells. Thus refactoring has to be rethought to take into account the different structural aspects that components and interfaces exhibit.

Our contribution in this paper is twofold: first, we propose detection rules for component-relevant bad smells as well as a catalogue of component-relevant refactorings to get rid of them. Second, we rely on these two elements to propose a search-based approach for automated refactoring of component-based systems.

This paper is organized as follows: In Section 2, we present a detailed description of the problem. Section 3 describes our approach with a focus on the bad smell detection rules, the proposed refactorings and the genetic algorithm we use. Section 4 contains a discussion of the experimental study that we performed. Finally, Section 5 concludes the paper and presents future perspectives.

Kebir, S., Borne, I. and Meslati, D.
Automatic Refactoring of Component-based Software by Detecting and Eliminating Bad Smells - A Search-based Approach.
In Proceedings of the 11th International Conference on Evaluation of Novel Software Approaches to Software Engineering (ENASE 2016), pages 210-215
ISBN: 978-989-758-189-2

2 PROBLEM DESCRIPTION

We address the automated refactoring of component-based software. Concretely, the solution to this problem consists in detecting and eliminating bad smells by applying refactoring operations at the component level. The inputs of this problem are: the source code, a set of component-relevant bad smells and a set of component-relevant refactorings. In the following, we give more details on these three elements.

2.1 The Source Code

The source code of a software system is the most reliable and accurate source of information describing it. However, in the context of automated refactoring, the source code cannot be used in its textual form because it requires highly expensive parsing operations, which degrade the performance of the overall process. In order to avoid these costs, the source code must first be reified in an intermediate structure called the source code model. Such a structure must be designed to allow measuring the properties that we need later for the bad smell detection rules. It must also be suitable for simulating actual refactorings and checking that they do not lead to incoherent situations.

2.2 Component-relevant Bad Smells

Recently, Garcia et al. (Garcia et al., 2009) identified four representative component-relevant bad smells that they encountered in the context of reverse-engineering and refactoring of large industrial systems. In order to detect such smells, they provide architects with UML diagrams and concrete textual definitions of each bad smell. More recently, in the same perspective, Macia et al. (Macia et al., 2013) extended this catalogue.

2.3 Component-relevant Refactorings

In general, refactorings are often associated with a set of bad smells (Fowler et al., 1999), by analogy with medical diagnosis and treatment. Nevertheless, object-oriented refactorings do not seem adequate for refactoring component-based software because of the additional level of abstraction introduced by components and interfaces.

3 SOLUTION APPROACH

In our approach, automated refactoring is implemented using a search-based technique. We decompose our approach into three steps: (i) extraction of relevant information from the source code to construct the source code model, (ii) formulation of a detection rule and a refactoring for each component-relevant bad smell and finally (iii) exploration of the solution space using a genetic algorithm. Figure 1 depicts these three steps. Next, we describe each step in detail.

Figure 1: Overall view of our approach.

3.1 Facts Extraction

During this step, we construct the source code model from the source code and additional artifacts (e.g. XML configuration files), in accordance with the metamodel shown in Figure 2.

Figure 2: Source Code Metamodel.

We constructed this metamodel according to a recent survey conducted by Vale et al. (Vale et al., 2016). This survey states that components are often considered as sets of classes, and that interfaces are those classes which have a link with some classes outside the component (e.g. a method call or attribute use from the outside).
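The view of components and interfaces described above can be illustrated with a minimal sketch. The names `Call`, `Component` and `provided_interfaces` are hypothetical and are not taken from the metamodel of Figure 2; the sketch only assumes that a component is a set of class names and that a class becomes an interface when it is called from outside its component.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Call:
    """A method call from one class to another (both given by name)."""
    caller: str
    callee: str


@dataclass
class Component:
    """A component viewed as a named set of classes (hypothetical structure)."""
    name: str
    classes: set[str] = field(default_factory=set)


def provided_interfaces(component: Component, calls: list[Call]) -> set[str]:
    """Return the classes of `component` that are called from outside it."""
    return {
        call.callee
        for call in calls
        if call.callee in component.classes
        and call.caller not in component.classes
    }


calls = [
    Call("ui.View", "core.Facade"),      # external call into the component
    Call("core.Facade", "core.Parser"),  # internal call, does not create an interface
]
core = Component("core", {"core.Facade", "core.Parser"})
print(provided_interfaces(core, calls))  # {'core.Facade'}
```

Attribute uses from the outside could be handled the same way by adding a second kind of link next to `Call`.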

In order to extract this information, we have designed and implemented an extraction engine that relies on the API provided by Eclipse JDT. The extraction engine depends on the component model, the underlying programming language and additional component-model specific resources such as XML and Manifest configuration files. So far, we have successfully defined and implemented an extraction engine for OSGi component-based applications.
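As an illustration of the kind of component-model specific resource mentioned above, the following sketch reads two standard OSGi headers (`Bundle-SymbolicName` and `Export-Package`) from the text of a bundle manifest. It is not the extraction engine described in this paper, which relies on Eclipse JDT; the function names are hypothetical, and the parsing is deliberately naive (it assumes that no comma appears inside a quoted attribute value).

```python
def parse_manifest(text: str) -> dict[str, str]:
    """Parse 'Name: value' headers; a line starting with a space continues the previous header."""
    headers: dict[str, str] = {}
    current = None
    for line in text.splitlines():
        if line.startswith(" ") and current is not None:
            headers[current] += line[1:]          # continuation line
        elif ": " in line:
            current, value = line.split(": ", 1)  # new header
            headers[current] = value
    return headers


def exported_packages(headers: dict[str, str]) -> list[str]:
    """Return package names listed in Export-Package, dropping attributes such as ;version=..."""
    raw = headers.get("Export-Package", "")
    # Naive split: assumes no commas inside quoted attribute values.
    return [clause.split(";", 1)[0].strip() for clause in raw.split(",") if clause.strip()]


manifest = (
    "Bundle-SymbolicName: org.example.parser\n"
    "Export-Package: org.example.parser.api,\n"
    " org.example.parser.spi;version=\"1.0.0\"\n"
)
headers = parse_manifest(manifest)
print(headers["Bundle-SymbolicName"])  # org.example.parser
print(exported_packages(headers))      # ['org.example.parser.api', 'org.example.parser.spi']
```

The bundle and package names are invented for the example.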

3.2 Formulation of Detection Rules and Associated Refactorings

In this section we revisit component-relevant bad smells. We propose to detect each bad smell by refining its description into an informal rule, and then to extract from these rules measurable properties whose values range in [0,1] and which pertain to internal attributes and metrics of the constituents of the source code metamodel.

Furthermore, for each bad smell we propose a

refactoring to eliminate it. Similarly to Fowler’s ap-

proach (Fowler et al., 1999) we describe signiﬁcant

properties of each refactoring using the following

template.

Table 1: Refactoring template.

Context: summarizes the situation in which the refactoring is needed. That is, it explains when to perform the refactoring.
Summary: reflects concisely what action the refactoring performs and where it has to be performed.
Mechanics: describes how to perform the refactoring.

3.2.1 Ambiguous Interface

Definition. Components suffering from this bad smell offer only a single, general entry-point. Such an interface is referred to as ambiguous (Garcia et al., 2009). Moreover, it dispatches requests to internal services not belonging to any interface (Garcia et al., 2009).

Footnote: Eclipse JDT. http://eclipse.org/jdt/

Detection Rule. According to the previous definition, to judge whether a component suffers from ambiguous interface, we need to know the number of its interfaces and the number of their services. The lower these two numbers are, the more ambiguous the interfaces of the component are. However, this information alone is not sufficient to assess how ambiguous an interface is. Indeed, we also need to know how much each interface dispatches requests to other internal services not belonging to any interface. Thus, we define the following rule to assess how much a component suffers from this bad smell:

$$AI(C) = \frac{1}{|C.p| \cdot \sum_{i \in C.p} |SOS(i)|} \cdot \frac{\sum_{i \in C.p,\; j \in C \setminus C.p} |SOC(i,j)|}{\sum_{i \in C.p,\; k \in C} |SOC(i,k)|} \tag{1}$$

where:
• C.p denotes the set of provided interfaces of the component C.
• SOS(i) denotes the set of services belonging to the interface i.
• SOC(i,c) denotes the set of outgoing calls from the services belonging to an interface i to public methods belonging to the class c.
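The rule can be sketched in code under an explicit assumption about how Equation (1) is read: the reciprocal of (number of provided interfaces × total number of their services), multiplied by the share of outgoing calls from interface services that target classes not exposed as interfaces. This reading, the function name and the dictionary-based inputs are assumptions made for illustration only.

```python
def ambiguous_interface(classes, services, soc):
    """Sketch of AI(C) under the reading stated in the lead-in.

    classes:  set of class names in the component C
    services: dict interface -> set of its services, i.e. SOS(i); its keys play the role of C.p
    soc:      dict (interface, class) -> number of outgoing calls, i.e. |SOC(i, c)|
    """
    provided = set(services)
    n_services = sum(len(s) for s in services.values())
    total = sum(soc.get((i, k), 0) for i in provided for k in classes)
    if not provided or n_services == 0 or total == 0:
        return 0.0  # assumption: nothing to measure means no smell
    internal = sum(soc.get((i, j), 0) for i in provided for j in classes - provided)
    return (1 / (len(provided) * n_services)) * (internal / total)


# One interface with a single entry point that dispatches everything internally.
classes = {"Facade", "Parser", "Lexer"}
services = {"Facade": {"handle"}}
soc = {("Facade", "Parser"): 3, ("Facade", "Lexer"): 1}
print(ambiguous_interface(classes, services, soc))  # 1.0
```

Adding more interfaces or more services lowers the value, which matches the informal rule that fewer interfaces and services mean a more ambiguous component.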

Proposed Refactoring: Pull Interface. In a component suffering from ambiguous interface, there may be classes that offer services but are not defined as provided interfaces. This reduces analyzability and understandability since a user must look into the implementation of the component to know which services it offers. The proposed refactoring, namely Pull Interface, consists in creating a new provided interface for a component, using the mechanisms of the underlying component model to turn such classes into provided interfaces.

3.2.2 Connector Envy

Deﬁnition. Components with Connector Envy en-

compass extensive interaction-related functionality

between two or more other components (Garcia et al.,

2009).

Detection Rule. According to the previous deﬁni-

tion, a component suffering from this bad smell dele-

gates the majority of its requests to other components.

Thereby, the number of its incoming and outgoing

calls should be relatively high. Consequently we de-

ﬁne the following rule to assess how much a compo-

nent suffers from connector envy :

$$CE(C) = \frac{\sum_{i \in C,\; j \notin C} |SOC(i,j) \cup SOC(j,i)|}{\sum_{i \in C,\; \forall k} |SOC(i,k) \cup SOC(k,i)|} \tag{2}$$
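A minimal sketch of this rule follows, assuming that calls are available as counts per (caller class, callee class) pair and that the two call sets SOC(i,j) and SOC(j,i) are disjoint, so that the size of their union is the sum of the two counts. The function name and the input layout are hypothetical.

```python
def connector_envy(component, all_classes, soc):
    """Sketch of CE(C): share of the calls involving C's classes that cross C's boundary.

    component:   set of class names belonging to C
    all_classes: set of all class names in the system
    soc:         dict (caller, callee) -> number of calls
    """
    def pair(i, k):
        # |SOC(i,k) U SOC(k,i)|, counting a self-call set only once
        return soc.get((i, k), 0) + (soc.get((k, i), 0) if i != k else 0)

    external = sum(pair(i, j) for i in component for j in all_classes - component)
    total = sum(pair(i, k) for i in component for k in all_classes)
    return external / total if total else 0.0


# A component that only forwards calls from A to B.
component = {"Mediator"}
all_classes = {"Mediator", "A", "B"}
soc = {("A", "Mediator"): 2, ("Mediator", "B"): 2}
print(connector_envy(component, all_classes, soc))  # 1.0
```

Because the sum in the denominator ranges over every class of the component, a call between two classes of the same component is counted once from each side; the sketch follows the formula literally in that respect.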


Proposed Refactoring: Push Component. A component with connector envy only delegates calls from one component to another and does not have a proper responsibility. Thus it should be integrated into one or the other. This bad smell reduces reusability insofar as the component cannot be reused elsewhere. The proposed refactoring, namely Push Component, consists in integrating a component into another by moving all the classes belonging to the former into the latter and deleting the old component. By applying this refactoring, components with connector envy are eliminated. Thus, the lack of reusability is no longer an issue.
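The mechanics of Push Component can be sketched on a simplified model in which the system is a mapping from component names to sets of class names. This representation and the function name are assumptions made for the example; a real implementation would also have to update the component model's own configuration.

```python
def push_component(model, source, target):
    """Return a new model where all classes of `source` are moved into `target`.

    model: dict component name -> set of class names (left unmodified)
    """
    if source == target or source not in model or target not in model:
        raise ValueError("source and target must be two distinct existing components")
    refactored = {name: set(classes) for name, classes in model.items() if name != source}
    refactored[target] |= model[source]  # move the classes, the old component is dropped
    return refactored


model = {"a": {"A"}, "glue": {"G"}, "b": {"B"}}
print(push_component(model, "glue", "a"))  # {'a': {'A', 'G'}, 'b': {'B'}}
```

Returning a new model instead of mutating the input makes it easy to simulate a refactoring and discard the result.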

3.2.3 Scattered Parasitic Functionality

Deﬁnition. This bad smell occurs in a system where

multiple components are responsible for realizing the

same high-level concern and, additionally, some of

these components are individually responsible for an

additional unrelated concern (Garcia et al., 2009).

Detection Rule. Given a set of components, in order to detect this bad smell, we need to measure the overall cohesion of this set of components and the individual cohesion of each component. On the one hand, the higher the overall cohesion, the more a functionality is scattered among this set of components. On the other hand, the higher the cohesion of each component, the less this set of components suffers from scattered parasitic functionality. So, in order to detect this bad smell, we propose the following rule to assess whether a set of components S = {C_1, ..., C_n} suffers from scattered parasitic functionality:

$$SPF(S) = \frac{1}{2} \cdot \left( LCC(S) + \frac{\sum_{C_i \in S} \left(1 - LCC(C_i)\right)}{|S|} \right) \tag{3}$$

where:
• LCC(c) denotes the cohesion of the classes belonging to the component c according to the Loose Class Cohesion metric proposed in (Bieman and Kang, 1995).
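The rule can be sketched as follows, reading Equation (3) as the mean of two terms: the cohesion of the whole set S, and the average lack of cohesion (1 − LCC) of its components. The factor 1/2 is part of that reading and is an assumption, chosen so that the result stays in [0,1]. LCC values are taken as given inputs here; the function name is hypothetical.

```python
def scattered_parasitic_functionality(lcc_of_set, lcc_of_components):
    """Sketch of SPF(S) from precomputed cohesion values.

    lcc_of_set:        LCC(S), cohesion of all classes of the set taken together
    lcc_of_components: list of LCC(C_i), one value per component of S
    """
    if not lcc_of_components:
        raise ValueError("S must contain at least one component")
    average_lack = sum(1 - lcc for lcc in lcc_of_components) / len(lcc_of_components)
    return 0.5 * (lcc_of_set + average_lack)


# A cohesive set (0.75) made of two components that are individually less cohesive.
print(scattered_parasitic_functionality(0.75, [0.5, 0.25]))  # 0.6875
```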

Proposed Refactoring: Merge Components. In a system suffering from Scattered Parasitic Functionality, several components may be individually responsible for implementing a wide-scope concern. This violates the separation of concerns principle since a concern is scattered among a set of elements; such a concern should instead be encompassed in a single component. The proposed refactoring, namely Merge Components, consists in merging two or more components into a new one by creating a new component containing all the classes belonging to these components and deleting the old ones.

3.2.4 Component Concern Overload

Deﬁnition. Components with concern overload are

responsible for realizing two or more unrelated archi-

tectural concerns (Garcia et al., 2009).

Detection Rule. This bad smell can be easily de-

tected by measuring the cohesion of the component.

The lower this measure, the more the component is

suffering from concern overload. So, we propose this

rule to assess how much a component is overloaded

with many concerns :

$$CCO(C) = 1 - LCC(C) \tag{4}$$
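To make the rule concrete, the sketch below computes a loose cohesion value and derives CCO from it. How the Loose Class Cohesion metric is lifted from the methods of a class to the classes of a component is not detailed here, so the sketch makes an assumption: classes are nodes, a dependency between two classes is an undirected edge, and cohesion is the fraction of class pairs that are connected directly or transitively. Function names are hypothetical.

```python
from itertools import combinations


def loose_cohesion(nodes, edges):
    """Fraction of node pairs that are connected directly or transitively."""
    nodes = list(nodes)
    if len(nodes) < 2:
        return 1.0  # assumption: a single class is trivially cohesive
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:  # every edge endpoint must be listed in `nodes`
        parent[find(a)] = find(b)
    pairs = list(combinations(nodes, 2))
    connected = sum(1 for a, b in pairs if find(a) == find(b))
    return connected / len(pairs)


def component_concern_overload(nodes, edges):
    """CCO(C) = 1 - LCC(C), as in Equation (4)."""
    return 1 - loose_cohesion(nodes, edges)


# A, B and C are linked together, D is unrelated: 3 of the 6 pairs are connected.
print(component_concern_overload(["A", "B", "C", "D"], [("A", "B"), ("B", "C")]))  # 0.5
```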

Proposed Refactoring: Extract Component. In a single component suffering from Component Concern Overload, the separation of concerns principle is violated since an element is responsible for two or more concerns. The refactoring proposed here, namely Extract Component, consists in extracting a new component from an existing one by creating a new component containing a subset of the classes belonging to the given component.
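A minimal sketch of Extract Component on a simplified model (a mapping from component names to sets of class names) is given below. The representation, the function name and the requirement that the extracted subset be non-empty and proper are assumptions made for the example.

```python
def extract_component(model, source, new_name, classes_to_move):
    """Return a new model where `classes_to_move` leave `source` for a new component.

    model: dict component name -> set of class names (left unmodified)
    """
    moved = set(classes_to_move)
    if source not in model or new_name in model:
        raise ValueError("source must exist and new_name must be unused")
    if not moved or not moved < model[source]:
        raise ValueError("classes_to_move must be a non-empty proper subset of the source")
    refactored = {name: set(classes) for name, classes in model.items()}
    refactored[source] -= moved
    refactored[new_name] = moved
    return refactored


model = {"core": {"Parser", "Lexer", "Logger"}}
print(extract_component(model, "core", "logging", {"Logger"}))
```

Which subset of classes to extract is left to the caller; in a search-based setting it would be one of the parameters of the refactoring.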

3.2.5 Overused Interface

Deﬁnition. Also called Fat Interfaces (Romano

et al., 2014), these are interfaces whose clients invoke

different subsets of their services (Macia et al., 2013).

Detection Rule. This bad smell can be detected by measuring, for each client of a given interface, the number of services invoked together. The higher this number, the less the interface is overused. Thus, similarly to (Romano et al., 2014), we propose to detect this bad smell by measuring the average, over all the clients of a given interface, of the ratio of services they invoke, using the following rule:

$$OI(i) = \frac{1}{|CLIENTS(i)|} \sum_{C_j \in CLIENTS(i)} \frac{|SOC(C_j, i)|}{|SOS(i)|} \tag{5}$$

where :

• CLIENTS(i) denotes the set of clients using the

interface i.
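The average ratio described above can be sketched as follows, assuming that for each client we know the set of distinct services of the interface it invokes. The function name and the input layout are hypothetical. As stated in the detection rule, a higher average means that clients use a larger part of the interface.

```python
def overused_interface(services, invoked_by_client):
    """Sketch of OI(i): average share of the interface's services used by each client.

    services:          set of services of the interface, i.e. SOS(i)
    invoked_by_client: dict client -> set of services of i that the client invokes
    """
    if not services or not invoked_by_client:
        return 0.0  # assumption: no services or no clients means nothing to average
    ratios = [
        len(invoked & services) / len(services)
        for invoked in invoked_by_client.values()
    ]
    return sum(ratios) / len(ratios)


services = {"open", "read", "write", "close"}
clients = {
    "Reader": {"open", "read", "close"},  # uses 3 of 4 services
    "Writer": {"open", "write"},          # uses 2 of 4 services
}
print(overused_interface(services, clients))  # 0.625
```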

Proposed Refactoring: Extract Interface. An Overused Interface may be caused by a God Class (Fowler et al., 1999). The refactoring proposed here, namely Extract Interface, consists in extracting a new interface from an existing one by creating a new interface containing a subset of the methods belonging to the given interface.

3.3 Genetic Algorithm

Search-based approaches rely on three key ingredients (Bavota et al., 2014): (i) an individual representation used to encode a solution to the problem; (ii) a fitness function, which is a means to assess the quality of a given individual; and (iii) change operators, which are used to produce new neighboring solutions starting from existing ones. In order to implement a genetic algorithm (GA) for automated refactoring of component-based software, we describe in the following each of these three elements and how they are articulated within the genetic algorithm.

3.3.1 Individuals Representation

In our approach, individuals are composed of two el-

ements (Figure 3):

• The genotype, which is an ordered variable-length sequence of refactorings including the necessary parameters. When the sequence of refactorings is executed, it performs these changes and produces a modified version of the source code model.
• The phenotype, which is the source code model obtained after applying the sequence of refactorings to the initial source code model in the order given in the genotype.

Figure 3: Individual Representation.

Our use of a source code model as a phenotype en-

ables efﬁcient computation of bad smells detection

rules.
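The genotype/phenotype pair can be sketched as follows. The sketch assumes a simplified source code model (a mapping from component names to sets of class names) and represents each refactoring as a function from model to model; `Individual` and `push_component` are hypothetical names, and treating an inapplicable refactoring as a no-op is an assumption of the example.

```python
from dataclasses import dataclass
from typing import Callable

Model = dict  # component name -> frozenset of class names
Refactoring = Callable[[Model], Model]


def push_component(source: str, target: str) -> Refactoring:
    """Build a refactoring that moves all classes of `source` into `target`."""
    def apply(model: Model) -> Model:
        if source not in model or target not in model or source == target:
            return model  # inapplicable step: leave the model unchanged
        result = {name: classes for name, classes in model.items() if name != source}
        result[target] = model[target] | model[source]
        return result
    return apply


@dataclass
class Individual:
    genotype: list  # ordered, variable-length sequence of refactorings

    def phenotype(self, initial: Model) -> Model:
        """Apply the refactorings in order to a copy of the initial model."""
        model = dict(initial)
        for refactoring in self.genotype:
            model = refactoring(model)
        return model


initial = {"a": frozenset({"A"}), "b": frozenset({"B"}), "c": frozenset({"C"})}
individual = Individual([push_component("a", "b"), push_component("b", "c")])
print(individual.phenotype(initial))  # {'c': frozenset({'A', 'B', 'C'})}
```

The order matters: the second step only makes sense on the model produced by the first one.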

3.3.2 Fitness Function

In our approach, the ﬁtness function is the sum of the

ﬁve above-deﬁned rules used to detect bad smells in

all components and interfaces of the application. The

ﬁtness function is evaluated on an individual by (i)

running the sequence of refactoring operations con-

tained in its genotype and (ii) evaluating the detection

rules on the resulting source code model contained in

its phenotype.
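A minimal sketch of such a fitness function is given below. It assumes that each detection rule is supplied as a pair of functions: one that lists the elements the rule applies to (components, sets of components or interfaces) and one that scores an element in [0,1]. This pairing and the name `fitness` are assumptions; how the rules are actually aggregated over the application may differ.

```python
def fitness(model, rules):
    """Sum every detection rule over every element it applies to (lower is better).

    rules: list of (subjects_of, score) pairs, where subjects_of(model) yields the
           elements a rule applies to and score(model, element) returns a value in [0, 1]
    """
    total = 0.0
    for subjects_of, score in rules:
        for subject in subjects_of(model):
            total += score(model, subject)
    return total


# Toy phenotype: component name -> precomputed cohesion LCC(C).
lcc = {"core": 0.25, "ui": 0.5}
# Component Concern Overload rule, CCO(C) = 1 - LCC(C), applied to every component.
cco_rule = (lambda model: model.keys(), lambda model, c: 1 - model[c])
print(fitness(lcc, [cco_rule]))  # 1.25
```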

3.3.3 Change Operators

In each iteration, the GA starts by (i) selecting the chromosomes that will form a mating pool for crossover and mutation using roulette wheel selection. This selection is based on the fitness value of individuals. Then (ii) the offspring are generated by applying one-point crossover to each pair to generate two new chromosomes. After that, (iii) mutation is applied to each chromosome in the current population with a user-defined probability. It either replaces a randomly chosen refactoring operation with a new one, randomly inserts a new refactoring operation into the genotype, or randomly deletes one from it. The process continues until the chosen number of generations is reached.
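The three operators can be sketched as follows. Several details are assumptions of the sketch rather than choices stated in the text: since lower fitness values are better, the roulette wheel uses the weight 1/(1 + fitness); the crossover point is drawn up to the length of the shorter parent; and the three mutation actions are equally likely. Genotypes are plain lists, and `candidates` is a hypothetical pool of refactorings to draw from.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Draw a mating pool; lower fitness gets a larger slice of the wheel."""
    weights = [1.0 / (1.0 + f) for f in fitnesses]
    return rng.choices(population, weights=weights, k=len(population))


def one_point_crossover(parent1, parent2, rng):
    """Swap the tails of two genotypes after a common cut point."""
    cut = rng.randint(0, min(len(parent1), len(parent2)))
    return parent1[:cut] + parent2[cut:], parent2[:cut] + parent1[cut:]


def mutate(genotype, candidates, rng, probability):
    """With the given probability, replace, insert or delete one refactoring."""
    child = list(genotype)
    if rng.random() >= probability:
        return child
    action = rng.choice(["replace", "insert", "delete"])
    if action == "insert" or not child:
        child.insert(rng.randint(0, len(child)), rng.choice(candidates))
    elif action == "replace":
        child[rng.randrange(len(child))] = rng.choice(candidates)
    else:
        del child[rng.randrange(len(child))]
    return child


rng = random.Random(42)  # fixed seed so that runs are repeatable
population = [["r1", "r2"], ["r3"], ["r4", "r5", "r6"]]
pool = roulette_select(population, [3.0, 5.0, 1.0], rng)
child1, child2 = one_point_crossover(pool[0], pool[1], rng)
mutant = mutate(child1, ["r7", "r8"], rng, probability=0.1)
```

A full generation would repeat crossover over all pairs of the mating pool, mutate the offspring and re-evaluate the fitness of each new individual.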

4 CASE STUDY

We evaluated our approach on Eclipse MAT (Memory Analyzer Tool), a standalone OSGi application that helps programmers detect memory leaks. Eclipse MAT contains 12 components (OSGi bundles). Figure 4 depicts the dependencies between Eclipse MAT components, with a focus on the ones severely affected by bad smells (colored in grey).

Figure 4: Dependency diagram of Eclipse MAT.

4.1 Approach Efﬁciency

In our genetic algorithm, we used 1000 generations with a population size of 20. The results of our evaluation are summarized in Figure 5.

We notice that our approach substantially improves the value of the fitness function. Indeed, the fitness value of the best proposed solution was 3.86, against 8.27 for the original design. This indicates that 4.41 (8.27 − 3.86) of the bad smell score has been eliminated, which gives an efficiency value of 53% (4.41/8.27).
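The arithmetic behind these figures can be checked with a few lines, taking 8.27 as the fitness of the original design, as the subtraction in the text implies. Variable names are illustrative.

```python
initial_fitness = 8.27  # fitness of the original design
best_fitness = 3.86     # fitness of the best solution found by the GA

fixed = initial_fitness - best_fitness  # amount of bad smell score removed
efficiency = fixed / initial_fitness    # relative improvement

print(f"fixed = {fixed:.2f}, efficiency = {efficiency:.0%}")  # fixed = 4.41, efficiency = 53%
```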

Eclipse Memory Analyzer Tool : www.eclipse.org/mat


Figure 5: Fitness value evolution (Lower values are better).

4.2 Approach Accuracy

We manually investigated the obtained design to

judge if the proposed refactorings are accurate and we

have found that the best solution produced by the GA

contains 9 components (Figure 6).

Figure 6: System design after applying refactorings.

We notice that in the original design 6 components were severely suffering from bad smells (colored in grey in Fig. 4). However, 8 components have been refactored into 5 new ones (colored in blue in Fig. 6) and the 4 remaining components stayed untouched. Among these 8 components, only 2 were not affected by bad smells. This gives us a false positive rate of 16.67% (2/12). This low value indicates that our approach is accurate at detecting and correcting bad smells.
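The component counts and the false positive rate reported above can be cross-checked as follows; variable names are illustrative.

```python
total_components = 12         # components in the original design
refactored = 8                # components touched by the proposed refactorings
new_components = 5            # components that replace the refactored ones
refactored_without_smell = 2  # refactored components that had no bad smell

untouched = total_components - refactored          # 4 components left as they were
final_components = untouched + new_components      # size of the refactored design
false_positive_rate = refactored_without_smell / total_components

print(final_components)              # 9
print(f"{false_positive_rate:.2%}")  # 16.67%
```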

5 CONCLUSION

In this paper, we have addressed automated refactoring of component-based software systems. To tackle this problem, we have proposed detection rules for the recently proposed component-relevant bad smells as well as a catalogue of component-relevant refactorings. Then, we relied on these two elements to propose a genetic algorithm to find the best sequence of refactorings to perform. We have evaluated our approach on a medium-sized software system in terms of efficiency and accuracy.

To the best of our knowledge, our approach is the first attempt at automated refactoring of component-based applications. We believe that we can further improve it in the future. In the short term, we plan to extend our extraction engine to support more component models. In the long term, we plan to use component-relevant quality metrics to improve the exploration of the solution space.

REFERENCES

Bavota, G., Di Penta, M., and Oliveto, R. (2014). Search based software maintenance: Methods and tools. In Evolving Software Systems, pages 103–137. Springer.

Bieman, J. M. and Kang, B.-K. (1995). Cohesion and reuse in an object-oriented system. In ACM SIGSOFT Software Engineering Notes, volume 20, pages 259–262. ACM.

Fowler, M., Beck, K., Brant, J., Opdyke, W., and Roberts, D. (1999). Refactoring: Improving the Design of Existing Code.

Garcia, J., Popescu, D., Edwards, G., and Medvidovic, N. (2009). Toward a catalogue of architectural bad smells. In Architectures for Adaptive Software Systems, pages 146–162. Springer.

Macia, I., Garcia, A., Chavez, C., and von Staa, A. (2013). Enhancing the detection of code anomalies with architecture-sensitive strategies. In Software Maintenance and Reengineering (CSMR), 2013 17th European Conference on, pages 177–186. IEEE.

O’Keeffe, M. and Cinnéide, M. Ó. (2006). Search-based software maintenance. In Software Maintenance and Reengineering, 2006. CSMR 2006. Proceedings of the 10th European Conference on, 10 pp. IEEE.

Romano, D., Raemaekers, S., and Pinzger, M. (2014). Refactoring fat interfaces using a genetic algorithm. In Software Maintenance and Evolution (ICSME), 2014 IEEE International Conference on, pages 351–360. IEEE.

Seng, O., Stammel, J., and Burkhart, D. (2006). Search-based determination of refactorings for improving the class structure of object-oriented systems. In Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, pages 1909–1916. ACM.

Vale, T., Crnkovic, I., de Almeida, E. S., Neto, P. A. d. M. S., Cavalcanti, Y. C., and de Lemos Meira, S. R. (2016). Twenty-eight years of component-based software engineering. Journal of Systems and Software, 111:128–148.
