A Generic Approach for the Identification of Variability

Anilloy Frank and Eugen Brenner

Institute of Technical Informatics, Technische Universität, Inffeldgasse 16, 8010 Graz, Austria

Keywords:

Design Tools, Embedded Systems, Feature Extraction, Software Reusability, Variability Management.

Abstract:

Automotive electrical/electronics (E/E) embedded software development largely uses Model Based Software Engineering (MBSE), an industrially accepted approach. With the ever-increasing complexity of embedded software, the E/E models in automotive applications are becoming unmanageable. The heterogeneous nature of projects developed using several modeling and simulation tools, and the hierarchical structure with numerous composite components deeply embedded within, tends to lead to repetition. Hence it is often necessary to define a mechanism to identify reusable components that are embedded deep within these models. The proposed approach addresses the identification process in the development and deployment of software components used in the realization of such distributed processes, by selectively targeting the component-feature (CF) model instead of performing a comprehensive search. It addresses the problem of identifying the commonality of variants within a product development. The results are obtained faster and are more accurate than those of a comprehensive search.

1 INTRODUCTION

The current development trend in automotive software is to map embedded software components onto networked Electronic Control Units (ECUs) (Kum et al., 2008).

Variants of embedded software functions are inevitable when customizing for different regions (Europe, Asia, etc.) to meet the regulations of the respective regions. Different sensors and actuators, different device drivers, and the distribution of functionality across different ECUs also necessitate variants (Frank and Brenner, 2010a; Frank and Brenner, 2010b).

It is often preferable to procure well-established software components, tested for performance, safety and reliability, from external sources or Original Equipment Manufacturers (OEMs), as illustrated in Figure 1. The black-box characteristics of such software components, when integrated in models, further add to the complexity and hinder the management of variability.

Managing variability involves extremely complex and challenging tasks, which must be supported by effective methods, techniques, and tools (Clements and Northrop, 2007). In view of this complexity, achieving the required reliability and performance is one of the most challenging problems (Bosch, 2000).

The proposed strategy is a model-based approach for the distributed business process. The approach intends to facilitate automated and interactive strategies that address the identification process in the development and deployment of software components. We start by analyzing the textual representation of the model structure and form a concept to extract an element list that facilitates the identification of variability. The implementation and evaluation of the proposed strategy are based on the adaptation of a formal mathematical model presented in this paper.

2 RELATED WORK

For achieving large-scale software reuse, reliability, performance and rapid development of new products, Software Product-Line Engineering (SPLE) is an effective strategy. SPLE can be categorized into domain engineering and application engineering (Bachmann and Clements, 2005; Bosch, 2000). Domain engineering involves design, analysis and implementation of core objects, whereas application engineering is reusing these objects for product development.

Model Driven Software Development (MDSD) is typically realized in a distributed system environment for the development of automotive applications and products (Kulesza et al., 2007). Model-based techniques are used to support the usage of platform-independent code. The abstract specification of the components is done by domain experts, and the task of deploying these components on different platforms is handled separately by specific platform developers. As a consequence, the effort required for porting elements is reduced (Gomaa and Webber, 2004).

Frank A. and Brenner E.
A Generic Approach for the Identification of Variability.
DOI: 10.5220/0004009301670172
In Proceedings of the 7th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE-2012), pages 167-172
ISBN: 978-989-8565-13-6
© 2012 SCITEPRESS (Science and Technology Publications, Lda.)

Figure 1: External components as a hindrance to variability

management.

The Software Product-Line (SPL) approach promotes the generation of specific products from a set of core assets, in domains in which products have well-defined commonalities and variation points (Oliveira et al., 2005).

One of the fundamental activities in SPLE is variability management (VM). Throughout the SPL life cycle, VM explicitly represents variations of software artifacts, manages dependencies among variants, and supports their instantiation (Clements and Northrop, 2007).

Activities in the variability management process involve variability identification, variability specification and variability realization.

• The Variability Identification Process incorporates feature extraction and feature modeling.

• The Variability Specification Process derives a pattern.

• The Variability Realization Process is a mechanism to allow variability.

One of the basic elements in these approaches is a software component, which is an execution unit with well-defined interfaces (Szyperski, 2002). The usage of software components is driven by the requirement of improving the reusability of developed software artifacts. Mapping of software components onto networked ECUs is a distinct shift from Component Based Software Engineering (CBSE). Software components are combined with the help of assembly descriptions. They are specified in the development phase and are resolved in the deployment phase of a CBSE process (Crnkovic, 2005).

Despite all the hype, there is a lack of overall reasoning about variability management.

Although variability management is recognized as an important issue for the success of SPLs, there are not many solutions available (Heymans and Trigaux, 2003). Moreover, there are currently no commonly accepted approaches that deal with variability holistically at the architectural level (Galster and Avgeriou, 2011).

3 PROPOSED APPROACH

Models conforming to numerous tools and specifications, such as ESCAPE, EAST-ADL, UML tools, SysML specifications, and AUTOSAR, were considered, although this concept is not limited to the automotive domain alone.

3.1 Problem Analysis

• Textual Representation: An analysis of the models exhibits a common architecture. Figure 2 depicts the textual representation that underlies the graphical model. The textual representation is usually given in XML, which is strictly validated against a schema.

Figure 2: Mapping textual and graphical representations.

ENASE2012-7thInternationalConferenceonEvaluationofNovelSoftwareApproachestoSoftwareEngineering

168

The schema defines elements that are transformed into an explicit mapping, which specifies integrity constraints modeled as real-world entities in the project.

• Significant Nodes: Examination of the nodes in the textual representation of models depicted in Figure 3 reveals some interesting information. The nodes outlined in rectangles provide important information regarding the identity, specification, physical attributes, etc. of a component, but are insignificant from the perspective of variants. The CF model is derived manually from the schema: the elements that signify components are clustered to obtain a component list, and the elements within these that characterize features are collected as a feature vector.

• Heterogeneous Modeling Environment: A heterogeneous modeling environment may consist of numerous design tools, each with its own unique schemata, to offer integrity and avoid inconsistencies. Developed projects have to be strictly validated against the schemata of these tools.

Figure 3: XML Nodes that are not signiﬁcant for variability.
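The separation of significant nodes from general-information nodes described above can be illustrated with a minimal sketch. It assumes a toy XML model and hypothetical tag names (`Function`, `Module`, `Port`, `Signal`, `Timestamp`); the real schemata of the tools considered here define their own element names, and the tag sets would be chosen manually from those schemata.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names standing in for a manually derived CF model.
COMPONENT_TAGS = {"Function", "Module"}   # component list
FEATURE_TAGS = {"Port", "Signal"}         # feature vector

# Toy model; identity-style nodes (Timestamp, id, owner) carry no variant information.
MODEL_XML = """
<Project id="p1" owner="alice">
  <Function name="WiperControl">
    <Timestamp>2012-01-01</Timestamp>
    <Port name="RainSensorIn"/>
    <Function name="IntervalTimer">
      <Signal name="WipeInterval"/>
    </Function>
  </Function>
  <Module name="DoorLock">
    <Port name="KeyFobIn"/>
  </Module>
</Project>
"""

def extract_cf(root):
    """Return {component name: [feature names]} using only significant nodes."""
    cf = {}

    def visit(node, owner):
        if node.tag in COMPONENT_TAGS:
            owner = node.get("name")
            cf.setdefault(owner, [])
        elif node.tag in FEATURE_TAGS and owner is not None:
            cf[owner].append(node.get("name"))
        # All other tags are skipped, but their children are still traversed.
        for child in node:
            visit(child, owner)

    visit(root, None)
    return cf

cf_model = extract_cf(ET.fromstring(MODEL_XML))
print(cf_model)
```

Nested components are recorded as entries of their own, so deeply embedded composite components become visible in the flat component list.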

3.2 Concept and Approach

The workflow of the concept is depicted in Figure 4. The top layer represents the domain or core assets. Sets of projects conforming to the respective schemata of several modeling tools are depicted. Models are strongly hierarchical in nature, with numerous composite components deeply embedded within projects.

The middle layer is a semi-automatic variability identification layer, subdivided into two parts. The left part depicts sets of distinct component lists and corresponding feature vectors derived manually from the schemata for each modeling tool: a collection of elements that represent components and their descriptive features that significantly contribute to the identification of a component's variants. To assist the selection, the right part is a customized parser that generates a relevant lexicon from the set of software components within a project, together with a set of rules (viz., mandatory, optional, exclude) that govern the identification of variability.

The lower layer is an application layer, where the application developer provides the specification set and the result set is returned based on the rules.

Figure 4: Work ﬂow for semi-automated identiﬁcation of

variants.

Algorithm:

1. Obtain from the schema the subset of nodes that signify and describe whole or partial components, and add them to a component list.

2. Components may themselves comprise sub-nodes (components and features). Not all sub-nodes of the components in the component list may be essential to describe variability.

3. Therefore, for each element in the component list, obtain from the schema the subset of sub-nodes that describe features of the component, and add them to a feature vector.

4. Using the component list and the feature vector, generate a dictionary of keywords from within the project, along with their frequency, to determine the weight or significance of the keywords.

5. Apply rules (such as contains all, one or more, and does not contain) to search for the specification set, obtaining an intersection set, a union set, and a difference set to identify the components.
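Steps 4 and 5 can be sketched as follows. This is a minimal illustration, not the prototype described in this paper: the CF model contents, the function names and the mapping of the three rules onto intersection, union and difference are assumptions made for the example.

```python
from collections import Counter

# Hypothetical CF model: component name -> feature keywords (result of steps 1-3).
cf_model = {
    "WiperControl": ["rain", "sensor", "interval"],
    "DoorLock": ["key", "sensor"],
    "SeatHeater": ["temperature", "interval"],
}

def build_dictionary(cf):
    """Step 4: keyword -> frequency, used as a simple weight."""
    return Counter(kw for feats in cf.values() for kw in feats)

def components_with(cf, keyword):
    """All components whose feature vector contains the keyword."""
    return {name for name, feats in cf.items() if keyword in feats}

def apply_rules(cf, mandatory=(), optional=(), exclude=()):
    """Step 5: combine per-keyword component sets with set operations."""
    candidates = set(cf)
    for kw in mandatory:                      # contains all -> intersection
        candidates &= components_with(cf, kw)
    if optional:                              # one or more -> union
        any_of = set().union(*(components_with(cf, kw) for kw in optional))
        candidates &= any_of
    for kw in exclude:                        # does not contain -> difference
        candidates -= components_with(cf, kw)
    return candidates

lexicon = build_dictionary(cf_model)
selected = apply_rules(cf_model, mandatory=["sensor"], exclude=["key"])
print(lexicon.most_common(2), selected)
```

Keeping the rules as plain set operations makes each result traceable to the keywords that produced it, which suits the interactive use foreseen for the application layer.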

3.3 Mathematical Model

The formal representation of such a model is complex. The software model is composed of a set of functions, which further contain sub-functions and so on, exhibiting a hierarchical structure. The software models can be defined as

$$P = \{E, \Gamma\} \tag{1}$$

$P = \{p_1, p_2, \ldots, p_m\}$ is a finite set of models consisting of elements that form the functional modeling (the abstract specification of the components), solution modeling (the implementation of the components), and architecture design (deploying and mapping these components on different platforms). In addition, it also contains elements that are general rationale and do not signify any of these functionalities.

$E = \{e_1, e_2, \ldots\}$ is a finite set of elements comprising elements that provide general information (viz., id, time stamp, date, owner, etc.), elements that form components, and elements within the components that represent features. Some of these elements may be categorized as elements that describe variability or that contribute to signifying variants.

$\Gamma = \{\gamma_1, \gamma_2, \ldots\}$ is a finite set of elements which describe complex relationships that reflect information relationships, inheritance flow, and message exchanges.

Each of these models validates against a schema, and there is an isomorphic mapping relationship between the elements of the schema and the models.

We define a schema $S$ as a set of formulas that specify integrity and constraints

$$S = \{N, C\} \tag{2}$$

The schema defines the structure, entities, attributes, relationships, views, indexes, packages, procedures, triggers, types, sequences, synonyms and other elements.

$N = \{n_1, n_2, \ldots\}$ denotes a finite set of nodes or elements in a schema that describe integrity, whereas $C = \{c_1, c_2, \ldots\}$ denotes a finite set of elements in a schema that describe constraints. To accommodate a heterogeneous environment, which consists of projects developed using several modeling and simulation tools, $S = \{s_1, s_2, \ldots\}$ is a finite set of schemata, each representing a modeling or simulation tool.

At user reconfiguration level, the software model is represented in an abstract form, consisting of modules, functions, relationships, information, inherited flow, and message flow. The set of nodes $N$ and the set of constraints $C$ are subdivided into general elements and elements that signify components, features, functions, and relations:

$$N = \{n, \eta\}, \qquad C = \{c, \upsilon\} \tag{3}$$

$\eta = \{\eta_1, \eta_2, \ldots\}$ and $\upsilon = \{\upsilon_1, \upsilon_2, \ldots\}$ are finite sets of nodes and constraints, respectively, that signify components, features, functions, and relations, whereas $n = \{n_1, n_2, \ldots\}$ and $c = \{c_1, c_2, \ldots\}$ are finite sets of nodes and constraints, respectively, that comprise all other nodes.

Targeting all nodes in the model that are isomorphically mapped to $\eta$ and $\upsilon$ leads to a set of nodes that can be viewed as Significant Nodes (SN). As the functions are hierarchical, the software model may be viewed as a Significant Node Mesh (SNM).

SN can be defined as

$$SN = \{C_p, F_p, N_p, R\} \tag{4}$$

where $C_p = \{C_{p_1}, C_{p_2}, \ldots, C_{p_m}\}$ is a finite set of all components defined on the set $P$; for all $C_{p_i} \subset C_p$ with $i = 1, \ldots, m$, $C_{p_i}$ is a finite set including all components of $p_i$, and is a subset of $C_p$. $F_p = \{F_{c_1}, F_{c_2}, \ldots, F_{c_o}\}$ is a finite set of all features defined on the set $P$; for all $F_{c_j} \subset F_p$ with $j = 1, \ldots, o$, $F_{c_j}$ is a finite set including all features of $p_j$, and is a subset of $F_p$. $N_p$ and $R$ denote the set of naming conventions and the set of relations, respectively.

Let $S_P$ denote the nodes in model $P$ and let $M$ denote the nodes in schema $S$. Then there is a map (function) $\tau$ from $S_P$ into $M$, defined such that $\tau(n)$ is the definition (or rule) of $n \in S_P$ in $M$.

$$\tau : S_P \to M \tag{5}$$

Let $S_c$ be an element of $S$ representing a component $c$. Let $E_C$ be the subset of the schema $S$ which is extracted manually such that each element represents a variant component.

$$E_C = \{S_c \in S : c \text{ represents a component}\} \tag{6}$$

Let $E_F$ be the subset of $S$ which is extracted manually such that each element represents a feature of the component $c$.

$$E_F = \{E_f \in S : E_f \text{ represents a feature of the component } c\} \tag{7}$$

$E_F(i, c)$ denotes the $i$-th element of $E_F$ of a component $c$. Let $C_E$ be the subset of $C_p$ such that all elements of $C_E$ are represented in $E_C$:

$$C_E = \{c \in C_p : \tau(c) \in E_C\} \tag{8}$$

ENASE2012-7thInternationalConferenceonEvaluationofNovelSoftwareApproachestoSoftwareEngineering

170

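Let $F_E$ be the subset of $F_p$ such that the elements of $F_E$ are represented in $E_F$:

$$F_E = \{f \in F_p : \tau(f) \in E_F\} \tag{9}$$

Let $F_E(i, c)$ be the $i$-th element of $F_E$, where $i$ is an integer.

Let $V$ be the specification set. Then

$$R = \left[\bigcup_{c \in C_E} \left(\bigcup_{i} F_E(i, c)\right)\right] \cap V \tag{10}$$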

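In this method the number of elements in the resultant set $R$ is

$$\left| \left[\bigcup_{c \in C_E} \left(\bigcup_{i} F_E(i, c)\right)\right] \cap V \right| \tag{11}$$

On the other hand, in global search we get

$$\left| V \cap N \right| \tag{12}$$

where $N$ is the set of nodes in the project.

Clearly

$$\left| V \cap N \right| \geq \left| \left[\bigcup_{c \in C_E} \left(\bigcup_{i} F_E(i, c)\right)\right] \cap V \right| \tag{13}$$

Hence the result set obtained with this approach is no larger than that of a global search, and we conclude that it is an improved result set.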
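The comparison between the selective and the global result set can be spelled out in one more step. The derivation below is a sketch under one assumption that the text implies but does not state: every significant feature node is itself a node of the project. Here $U$ is shorthand, introduced only for this example, for the union of the feature nodes of all selected components, $V$ is the specification set, and $N$ is the set of nodes in the project.

```latex
\begin{align*}
U \subseteq N
  &\;\Longrightarrow\; U \cap V \subseteq N \cap V \\
  &\;\Longrightarrow\; |U \cap V| \le |V \cap N| .
\end{align*}
```

Equality holds only if every match of $V$ in the project falls on a significant node, so the fewer matches lie in general-information nodes, the closer the two searches become.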

3.4 Evaluation

The case studies targeted the design of model-based software components, firstly in an industrial use case where the project model was developed using the design tool ESCAPE (Gigatronik, 2009), and secondly in a case study targeting the execution of specific paradigms based on the naming convention of AUTOSAR.

The specific project data set, which was used to verify the implementation, consisted of a total of 32909 elements. A total of 1583 of these elements signify components; these were grouped into 23 categories when entered in the component list. A total of 13353 elements signified features, which were assigned to 12 categories.

Three different approaches were adopted to evaluate and determine the performance with respect to comprehensive search. The notion of comprehensive search is used for scanning all occurrences of the specification set within projects, irrespective of whether they are components or features of those components. This may return a result set that contains false matches.

• The evaluation using a single-element specification set is illustrated in Figure 5.

• The evaluation using multiple-element specification sets, with up to seven elements as a group, is illustrated in Figure 6.

• The evaluation using different starting points for elements in specification sets is shown in Figure 7.

Figure 5: Occurrence graph for a single element speciﬁca-

tion set.

Figure 6: Occurrence graph for multiple element speciﬁca-

tion sets.

Observations:

• The comprehensive search often yielded large result sets, as it searches in individual nodes that are treated as atomic. The result set contains every occurrence of the specification set, even if these nodes do not characterize a component.

• The exhibited behavior is similar for varying sizes of the specification set. As observed in Figure 6, the selective component-feature search delivers a non-empty result set when the size of the specification set exceeds 3, because in this case the matches take place across the boundaries of the features within the component. The other method, on the other hand, returns a null result set, as its search stays within the boundary of a single element.

AGenericApproachfortheIdentificationofVariability

171

Figure 7: Occurrence graph for different starting points.

• The nodes representing components yield a result set which is somewhat realistic, though it does not represent the complete set desired.

• These nodes along with the feature set yield a more elaborate result set. A match contained in any node of a set of features results in the component to which it belongs being represented.

• For any given size of the specification set, the selective component-feature search returns a much smaller result set and is more precise.

• Convergence is optimal with a specification set of size 3. If the size of the specification set is too large, the result may be null for both methods, as shown in Figure 6.

• To determine the effect of different starting points, a multiple-element specification set was used, where the order of the elements was changed to obtain five sets. The result set for this exhibits the same pattern as the two experiments above.
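The difference between matching within a single node and matching across the features of a component can be reproduced on a toy example. The sketch below is illustrative only: the project data, the substring-based matching and both function names are assumptions, not the evaluated prototype.

```python
# Hypothetical project: component name -> text of its feature nodes.
project = {
    "WiperControl": ["rain sensor input", "wipe interval", "motor output"],
    "DoorLock": ["key fob input", "lock motor output"],
}

def comprehensive_search(project, spec):
    """Treat every node as atomic: all keywords must occur in one node."""
    hits = []
    for comp, nodes in project.items():
        for node in [comp] + nodes:
            if all(kw in node.lower() for kw in spec):
                hits.append(node)
    return hits

def selective_search(project, spec):
    """Match across the feature boundaries within each component."""
    hits = []
    for comp, nodes in project.items():
        text = " ".join([comp] + nodes).lower()
        if all(kw in text for kw in spec):
            hits.append(comp)
    return hits

spec = ["rain", "interval", "motor"]
atomic_hits = comprehensive_search(project, spec)
component_hits = selective_search(project, spec)
print(atomic_hits, component_hits)
```

With this three-keyword specification no single node contains all keywords, so the node-level search comes back empty, while the component-level search still identifies the component whose features jointly cover the specification.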

4 CONCLUSIONS

An approach that can significantly improve the identification of variants is proposed, which targets significant nodes instead of performing a comprehensive search. The approach combines the capability to match keywords with the ability to reflect the structure that characterizes a component, enabling identification in large, distributed and heterogeneous development environments. The developed prototype is itself independent of a specific tool, as it works on textual descriptions that are typically available in XML. The accuracy of the retrieved set of candidates is considerably improved. Future work may extend the concept to specify and verify reusable components.

REFERENCES

Bachmann, F. and Clements, P. C. (2005). Variability in software product lines. Technical Report CMU/SEI-2005-TR-012.

Bosch, J. (2000). Design and Use of Software Architectures: Adopting and Evolving a Product-Line Approach. Addison-Wesley.

Clements, P. and Northrop, L. (2007). Software Product Lines: Practices and Patterns. Addison-Wesley.

Crnkovic, I. (2005). Component-based software engineering for embedded systems. Software Engineering, ICSE 2005. Proceedings. 27th International Conference, pages 712–713.

Frank, A. and Brenner, E. (2010a). Model-based variability management for complex embedded networks. 2010 Fifth International Multi-conference on Computing in the Global Information Technology, pages 305–309.

Frank, A. and Brenner, E. (2010b). Strategy for modeling variability in configurable software. Programmable Devices and Embedded Systems PDES 2010.

Galster, M. and Avgeriou, P. (2011). Handling variability in software architecture: Problem and implications. 2011 Ninth Working IEEE/IFIP Conference on Software Architecture, pages 171–180.

Gigatronik (2009). Escape. http://www.gigatronik

2.de/index.php?seite=escape produktinfos de &nav-

igation=3019&root=192&kanal.html.

Gomaa, H. and Webber, D. (2004). Modeling adaptive and evolvable software product lines using the variation point model. Proceedings of the 37th Hawaii International Conference on System Sciences, Washington.

Heymans, P. and Trigaux, J. (2003). Software product line: state of the art. Technical report for PLENTY project, Institut d’Informatique FUNDP, Namur.

Kulesza, U., Alves, V., Garcia, A., Neto, A. C., Cirilo, E., de Lucena, C. J. P., and Borba, P. (2007). Mapping features to aspects: A model-based generative approach. Current Challenges and Future Directions, Lecture Notes in Computer Science, pages 155–174.

Kum, D., Park, G., Lee, S., and Jung, W. (2008). Autosar migration from existing automotive software. International Conference on Control, Automation and Systems, pages 558–562.

Oliveira, E., Gimenes, I., Huzita, E., and Maldonado, J. (2005). A variability management process for software product lines. CASCON 05, pages 225–241.

Szyperski, C. (2002). Component Software: Beyond Object-Oriented Programming. 2nd Edition, Addison-Wesley, USA.

ENASE2012-7thInternationalConferenceonEvaluationofNovelSoftwareApproachestoSoftwareEngineering

172