Static Analysis and Testing of Executable DSL Specification

Qinan Lai and Andy Carpenter

School of Computer Science, The University of Manchester, Manchester, United Kingdom

Keywords: ALF, fUML, DSL, Modelling Language, Behavioural Semantics, Static Check.

Abstract: In model-driven software engineering, the syntax of a modelling language is defined as a meta-model, and

its semantics is defined by some other formal languages. As the languages for defining syntax and semantics

comes from different technology space, maintaining the correctness and consistency of a language

specification is a challenging topic. Technologies on formal methods or sophisticated dynamic verification

have been developed to verify a language specification. While these works are valuable, they can be hard to

apply to a complex language in reality. In this paper, extended static checking and testing are used to

maintain the correctness of a language specification, and the techniques are applied to a case study that

formalises WS-BPEL to a model-based specification defined by OMG standard fUML and ALF. Several

categories of different errors are identified which can happen during semantics development, and how our

framework can simplify the checking on them by static checking and direct testing of executable models is

discussed.

1 INTRODUCTION

It is common to need to create a new Domain

Specific Language (DSL) and a set of supporting

tools. Model-driven technologies address several

aspects of the development of a DSL, for example

EMF/Xtext (Efftinge and Völter, 2006)/GMF

support syntax development and OCL allows the

definition of static semantics. Experience has

showed that basing tool development on model-

driven technologies is simpler and faster than

traditional language parser/compiler or interpreter

approaches. Recently researchers have sought ways

to extend the use model-driven technologies to

definition of the behavioural semantics of a DSL

(Qinan and Carpenter, 2012); (Briand et al., 2005);

(Scheidgen and JFischer, 2007).

In practice, a DSL specification usually defines

its syntax and semantics in an abstract way. Tools to

support a DSL are built by implementing an

interpretation of this specification. Compared to

other means of defining DSL specification, a model-

based description has the advantage that it is both

human understandable and machine processable. By

exploiting the generation aspects of model-driven

engineering tooling implementations can be created

directly from the specification eliminating the

possibility of interpretation errors.

However, even when using a model-based

approach, the different aspects of a DSL defined

separately against independent meta-models. This

means that there is not a tool that can consider all

aspects of the DSL specification and identify, for

example, inconsistencies between them or errors in

embedded specifications. This is a known source of

errors; for example for many years the OCL

constraints embedded in the UML superstructure

specification contained more than a hundred syntax

errors (Wilke and Demuth, 2011), which were

eventually removed in UML 2.4 beta version.

In this paper, a unified and formalised definition

of the Business Process Execution Language DSL is

created using the Action Language for fUML (ALF)

(Object Management Group, 2010). This

specification forms a case study that is used to

identify the kinds of errors can happen while

creating a DSL definition. From the types of errors

static checks to programmatically identify errors are

being developed. The aim is to exploit the unified

description to reduce the effort needed to create

error free DSL tooling.

The contributions of the paper are: (1)

Identification of the seven categories of common

error patterns that can appear in a DSL specification;

these patterns are introduced in section 2. It is

identified that most of them are small errors, and

157

Lai Q. and Carpenter A..

Static Analysis and Testing of Executable DSL Speciﬁcation.

DOI: 10.5220/0004344401570162

In Proceedings of the 1st International Conference on Model-Driven Engineering and Software Development (MODELSWARD-2013), pages 157-162

ISBN: 978-989-8565-42-6

 2013 SCITEPRESS (Science and Technology Publications, Lda.)

simple automatic technologies can counter them. (2)

An extensible framework that performs static

analysis and testing of a DSL specification is

proposed to check these errors. The framework uses

an extended ALF language (Object Management

Group, 2010) to compose a DSL; currently it could

check inconsistency/syntax and many bad practices

on a DSL specification. The framework also

supports to generate an EMF based DSL interpreter

prototype, which could be used to test logic and

runtime errors.

2 ERRORS AND BAD PRACTISES

IN DSL SPECIFICATION

In this section, the context of how the errors are

identified is given. The WS-BPEL language is a

DSL aimed for web service composition. Its syntax

is defined as XML, and its semantics is defined by

natural language, but several works tried to

formalise it (Fahland and Reisig, 2005). We tried to

formalise it to a model-based specification, which

means creating a MOF based meta-model,

formalising the well-formed rules to OCL and

modelling the behavioural semantics of an

operational language. The framework of defining the

BPEL specification is based on our previous work

(Lai and Carpenter, 2012) The meta-model, the OCL

constraints and the behaviours are all defined as

ALF programs.

The meta-model of BPEL is created by

translating the Ecore model of BPEL from Eclipse

BPEL designer project to ALF structures.

Figure 1 introduces how our framework could

specify, testing and statically check a complete DSL

specification. Firstly the BPEL meta-model is

defined as an ALF program. ALF syntax for UML

units modelling captures the meta-model, and the

ALF statements and expressions captures the

behavioural semantics. The ALF program is defined

in an Xtext-based editor, which also provides static

checkers which could report checkable errors to the

DSL designer. By testing the ALF program through

a generated EMF application, new errors could be

found, and new static checkers could be created and

integrated to the framework with minimal effort.

2.1 Build ALF Executor as a Code

Generator

The ALF open source implementation can directly

execute ALF programs. However in our experience,

the software is not easy to use. Firstly it does not

support some necessary concepts, such as

inheritance of signal receptions and operation

overloading. Secondly the error message given by it

is not clear enough, for example, for many different

types of errors it will always report internal

reference errors. Practically we try to execute the

DSL specification by transforming the ALF based

spec to an Ecore model, and directly map the

operation body to Java code which embedded as the

Genmodel annotations. Thus EMF will generate a

Figure 1: Framework overview.

DSLspecification

Abstractsyntax

Staticsemantics

Behavioural

semantics

MDEtechnologyspace

ALFwithOCL

Concretesyntax

Definedby

DSLdesigner

Create

Modify

Definedby

Ecore

Generate

Test

Staticcheckers

UML

Generate

Check

Errorsandfeedback

Result

Testing

Reference

implementation

MODELSWARD2013-InternationalConferenceonModel-DrivenEngineeringandSoftwareDevelopment

158

Java application that has a one-to-one mapping to

the concepts defined in the specification. Some

concepts that do not have a one to one mapping are

tried to generate semantically similar code, for

example, translating ALF active class as a class

implements

Runnable interface, and start an active

class become creating a thread.

By generating Ecore model with Genmodel

annotations, the model editor can be reused to create

DSL testing models. As a result, a prototype of DSL

interpreter and code editor based on EMF are

generated from the DSL specification.

2.2 Identifying Errors by Testing

Executable DSL Specification

In this process of defining BPEL, the errors met

were documented. The process of error identification

and the creation of static checkers work as below.

Firstly, while developing the ALF programs, test

cases were created and the ALF programs are tested

by testing the generated EMF application. In this

process, many types of error could happen. The code

generator could generate wrong code, or there could

be errors in the ALF program. The errors that

happened in the ALF program, in other words, the

BPEL specification was relevant to this paper. Once

such an error was identified, they were documented.

And then the source and reason of the error was

analysed. Finally, static checkers were created and

dropped to the ALF editor, so the same types of

error would be eliminated or reduced.

It was identified that these errors were easily

introduced. If there was no static checker, they

would happen again and again. In summary, 32 error

patterns are identified, and they can be categorised

as the following 7 kinds of errors. The principle of

the categorisation was based on the source and the

reason of the errors.

2.3 Errors Identified in BPEL Case

Study

The syntax error is the most common kind of error.

It includes wrong syntax, type mismatch and any

violations on the well-formedness of the modelling

languages. Despite they are not hard to check, due to

the fact that behavioural semantics are defined in

another technology space, the tools that can take all

the kinds of errors into consideration is not valid.

Inconsistency errors can happen between the

definitions of different aspects of a DSL. The first

type of consistency is horizontal consistency, which

can happen when the meta-model, the static

semantics and the behavioural semantics referred to

an invalid concept. Vertical inconsistency may

happen when the meta-model changed, but the

model that conforms to the meta-model does not

change. Both horizontal and vertical inconsistency

can easily happen when the DSL specification

evolves. A small rename of one class in the meta-

model can cause all the semantics models that

referred to that model become invalid.

Figure 2: Inconsistency example.

The example in Figure 2 shows an example of

syntax and inconsistency error. The OCL invariance

called activity property, but in the meta-model it is

called activities. In the behavioural specification, the

run() operation is invoked, however in the meta-

model such an operation is undefined.

Conflicting errors can happen in static semantics

definition, where invariance on the meta-model

conflicting with each other. It can also happen if the

pre- and post-conditions of an operation is

conflicting with the static semantics. It is also

possible that the invariance of the meta-model

conflicts that leads to an unsatisfiable model, which

means there is no model which could be instantiate

that conforms to the meta-model.

Deficiency can happen when the DSL specification

lacks some certain properties. One common category

of bad practice is unused concepts or undefined

operation stubs. Another deficiency error is signal

deficiency, which could happen in the behavioural

semantics definition when the active class and signal

models are used. Consider the example ALF code:

public active class Execution {

public receive signal SignalStart{}

}do{

accept(SignalStart){

//do something

}

StaticAnalysisandTestingofExecutableDSLSpecification

159

When the class Execution is instantiated, it will wait

for other objects to send a

SignalStart then it will

continue. If this signal is not sent, the active object

goes to deadlock due to lack of signal.

Extended static errors are defined as the errors that

can be checked by static analysis, but they do not

belong to the syntax. In fact, many bad code

practices and errors belong to this category. For

example:

Comparing multiple valued variable with null

if (structuedActivity.activities==null)

should be

if(strucutedActivity.activities

->isEmpty())

The “instanceof expression always return false”

is another example, take the same meta-model in

Figure 1, and assume that

process is an instance of

Process

if (process instanceof Activity){}

the condition of the if statement will always remain

as false.

These kinds of errors are usually platform-specific to

ALF language. However, considering the action

languages for behavioural modelling share some

common design principles and even syntax are

similar. They are usually able to direct manipulating

models, have higher abstraction level and support

OCL-like syntax, the principles of static errors can

be adapted to other languages.

Platform specific errors can happen when the

developers wish to use the DSL standard as platform

independent models, and generate platform specific

models from it. For example, if the developers want

to generate a Java-based interpreter of the DSL, the

DSL models must avoid names preserved in Java. If

the model in Figure 2 is used to generate Java code,

it will override

java.lang.Process class and

result compiling error.

Another example is to enforce the naming rules

of Java. Any string could be legal names in ALF,

however, this lead to compile errors or code that are

hard to understand.

Logic errors and runtime errors can still happen,

and they are easy to identify by testing rather than

static checking.

3 STATIC ANALYSIS

AND TESTING ON DSL

SPECIFICATION

These errors identified in section 2 should be

avoided by some automatic technology, and when

developing a DSL specification, the developers

should apply automatic checking technology. We

designed a framework and which could specify a

complete DSL specification, and then perform static

analysis of the semantics specification to check

syntax error, inconsistency error and other static

errors. Logic errors are also testable by directly

execute the specification. Our framework uses the

syntax of ALF language plus adding OCL

annotations to it. The meta-model is defined as ALF

units. The OCL constraints are specified as an

annotation. The behavioural semantics are defined as

activities and operations. The framework of

specification and analysis is developed using Xtext.

The architecture of the static analysis is listed in

Figure 3. Different kinds of errors can be checked by

integrating relevant analysing technology. Because

the specification is defined by ALF language, errors

can naturally checked by Xtext validators.

Figure 3: Error checking and testing.

MODELSWARD2013-InternationalConferenceonModel-DrivenEngineeringandSoftwareDevelopment

160

In such a specification, abstract syntax, static

semantics and behavioural semantics are defined in a

single model-driven technical space. Unlike defining

them in different technical spaces that are hard to

check the consistency, the syntax errors and

inconsistency errors can be easily detected. The

detection of inconsistency and syntax errors become

the same problem of checking the validity of ALF

programs. By defining the grammar of ALF and

resolving the internal references, Xtext can report

syntax and inconsistency errors while editing the

ALF program.

The Xtext validator will check the errors that are

checkable in ALF domain. By using the extension

points of EMF plugin, it is possible to integrate other

types of validators. Currently the framework

supports to invoke OCL validator, other validators

are still under development.

The Xtext validator works as below: syntax

errors can appear in ALF text or OCL text. Xtext

will automatically check the errors which could be

checked by the parser. A type system is developed to

check type errors in the expressions. Separate

validator rules are defined to check well-formedness

rules, for example, an operation with a return type

must have a return statement in the entire execution

path. OCL syntax errors are checked by invoking

OCL validator in EMF. Extended static errors and

platform specific errors can be checked by the same

principle. All the static checkers require tens to

hundreds of lines of code, which are not hard to

create, but it showed that the checkers could

significantly reduce the errors in the specification.

Most logical and runtime errors are hard if not

possible to check by static analysis technologies.

However, some particular kinds of runtime errors

can be checker, for example, null pointer

dereference, impossible or redundant type cast.

4 FURTHERWORK

There are several unfinished works. There are still

some static checkers that are under research.

Conflicting errors are not directly checkable by

Xtext validation rules. One possible way to check it

is to translate the DSL spec to another analysis

domain and map the analysis result back to the users.

UMLtoCSP (Cabot et al., 2007) is a tool which can

check OCL conflicts. Currently we are working on

how to use this tool to report conflict errors. Because

this process contains translations, how to back

annotate the error message produced by the analysis

domain to the definition domain remains to be

researched.

Some Deficiency errors such as unused models

or empty stubs can be easily checked by our

framework. Currently our approach for checking

signal deficiency is a lightweight approach, which

only report error when one active class accepts some

signal, but there is no object that has sent these

signals. We wish to seek other ways that can check

more complex cases.

Currently the generation of DSL interpreter does

not support all the concepts defined in ALF standard,

it does not support direct use of OCL-like

expressions, the code still need some manual work

to test. It is interested to fully automate the

generation of an interpreter with no limitations.

The framework to use a unified definition to

define, check and test a DSL specification is only

tested by one case study. The behavioural semantics

is based on the imperative paradigm. It is necessary

to test whether the same technique can be applied to

declarative languages, because there is a large

number of DSLs which are declarative languages. It

is planned that to carry out another case study for

creating a model-based specification for a small

functional programming language.

5 CONCLUSIONS

In this paper, the correctness issue of a DSL

specification has been discussed. Seven categories of

error that can occur during the development of a

specification have been identified and introduced. It

has been demonstrated that most of these errors can

be detected using a simple static checker, making

their removal from specifications a trivial task. The

use of generating an implementation from a

specification has also been described. This has the

advantage of eliminating interpretation errors from

the process of creating DSL tooling. Finally an

extensible framework that brings together the

integration of static checks and the generation of

implementations has been outlined.

REFERENCE

Lionel Briand, Clay Williams, Pierre-Alain Muller, Franck

Fleurey, and Jean-Marc Jézéquel. Weaving

Executability into Object-Oriented Meta-languages,

volume 3713 of Lecture Notes in Computer Science,

pages 264–278. Springer Berlin / Heidelberg, 2005.

Jordi Cabot, Robert Clarisó, and Daniel Riera. Umltocsp:

a tool for the formal verification of uml/ocl models

StaticAnalysisandTestingofExecutableDSLSpecification

161

using constraint programming. In Proceedings of the

twenty-second IEEE/ACM international conference on

Automated software engineering, ASE ’07, pages

547–548, New York, NY, USA, 2007. ACM.

S. Efftinge and M. Völter. oaw Xtext: A framework for

textual dsls. In Workshop on Modeling Symposium at

Eclipse Summit, volume 32, 2006.

D. Fahland and W. Reisig. Asm-based semantics for bpel:

The negative control flow. In Proc. 12th International

Workshop on Abstract State Machines, pages 131–151.

Citeseer, 2005.

Object Management Group. Action language for

foundational uml (alf) 1.0 - beta 1.

www.omg.org/spec/ALF/, 2010.

Qinan Lai and Andy Carpenter. Defining and verifying

behaviour of domain specific language with fuml. In

Proceedings of the Fourth Workshop on Behaviour

Modelling - Foundations and Applications, BM-

FA ’12, pages 1:1–1:7, New York, NY, USA, 2012.

ACM.

Andreas Prinz, Markus Scheidgen, and Merete Tveit. A

model-based standard for sdl. In Emmanuel Gaudin,

Elie Najm, and Rick Reed, editors, SDL 2007: Design

for Dependable Systems, volume 4745 of Lecture

Notes in Computer Science, pages 1–18. Springer

Berlin / Heidelberg, 2007.

Markus Scheidgen and Joachim Fischer. Human

comprehensible and machine processable

specifications of operational semantics. In

Proceedings of the 3rd European conference on Model

driven architecture-foundations and applications,

ECMDA-FA’07, pages 157–171, Berlin, Heidelberg,

2007. Springer-Verlag.

C. Wilke and B. Demuth. Uml is still inconsistent! how to

improve ocl constraints in the uml 2.3 superstructure.

Electronic Communications of the EASST, 44, 2011.

MODELSWARD2013-InternationalConferenceonModel-DrivenEngineeringandSoftwareDevelopment

162