Verifying OCL Operational Contracts via SMT-based Synthesising

Hao Wu and Joseph Timoney

Computer Science Department, Maynooth University, Ireland

Keywords:

OCL Synthesis, Call Sequence, SMT.

Abstract:

The set of operational contracts written in the Object Constraint Language can be used to describe the be-

haviour of a system. These contracts are speciﬁed as pre/post conditions to constrain inputs and outputs

of operation calls deﬁned in a UML class diagram. Hence, a sequence of operation calls conforming to

pre/postconditions is crucial to analyse, verify and understand the behaviour of a system. In this paper, we

present a new technique for synthesising property-based call sequences from a set of operational contracts.

This technique works by reducing a synthesis problem to a satisﬁability modulo theories (SMT) problem. We

distinguish our technique from existing approaches by introducing a novel encoding that supports high levels

of expressiveness, ﬂexibility and performance. This encoding not only allows us to synthesise call sequences

at a much larger scale but also maintains high performance. The evaluation results show that our technique is

effective and scales reasonably well.

1 INTRODUCTION

UML models are central to Model Driven Engineer-

ing (MDE) based software development. They pro-

vide software engineers with different ways of visu-

alising structural and behavioural aspects of a system.

For example, UML class diagrams are used to depict

entities, attributes and relationships in a system. On

the other hand, Object Constraint Language (OCL)

is designed to describe formal rules or queries that

cannot be captured by UML models. For example,

a pre/postcondition written in OCL can constrain the

inputs and outputs of an operation call deﬁned in a

class diagram. The combination of UML and OCL is widely used for modelling software systems, not only because of its expressiveness but also because of its formality. Hence, the correctness of models annotated with OCL is crucial for MDE-based software development. However, verifying UML models together with OCL constraints remains a challenge in the modelling community.

Though numerous approaches have been proposed to tackle this challenge (Berardi et al., 2005; Büttner et al., 2012; Cabot et al., 2014; Wille et al., 2012),

many of them focus on verifying UML class diagrams

annotated with a set of OCL class invariants (struc-

tural aspects of a system). In general, OCL opera-

tional contracts (behavioural aspects of a system) are

speciﬁed as pre/post conditions of an operation call.

These pre/postconditions capture constraints over a

set of call sequences and system states. For example,

ﬁnding a sequence with respect to pre/postconditions

is the same as checking whether a valid system

state can be derived. In this paper, we introduce a

novel technique that allows us to verify OCL opera-

tional contracts via synthesising property-based call

sequences. Our technique works by reducing the synthesis problem to a satisfiability modulo theories (SMT) problem. More

speciﬁcally, our synthesis technique provides a high

level of expressiveness and ﬂexibility for not only

data types but also property-based synthesis.

In general, our synthesis constraints are in first-order form, which gives us two advantages: (1) we can perform efficient satisfiability checks by taking full advantage of the first-order reasoning capabilities of SMT solvers; (2) we can specify queries and properties over unrolled states without regenerating the synthesis constraints.

Contributions. We first presented our initial idea in (Wu, 2019); this paper gives the full technical details, including the detailed SMT encodings. The contributions of this paper are summarised as follows:

1. We design our synthesis constraints to be expres-

sive enough so that users can use quantiﬁed in-

variants and queries over a collection of transitions

(Section 4).

Wu, H. and Timoney, J.

Verifying OCL Operational Contracts via SMT-based Synthesising.

DOI: 10.5220/0009340602490259

In Proceedings of the 8th International Conference on Model-Driven Engineering and Software Development (MODELSWARD 2020), pages 249-259

ISBN: 978-989-758-400-8; ISSN: 2184-4348


2. We then introduce a set of different properties

based on an intermediate representation (Section

5). To optimise overall formula size and perfor-

mance, we also introduce three simpliﬁcation rules

(Section 6).

3. We evaluate our technique on a benchmark but with

much larger sequences, and show that our tech-

nique outperforms existing approaches in terms of

expressiveness, ﬂexibility and performance (Sec-

tion 7).

2 A RUNNING EXAMPLE

In this section, we introduce a real-world scenario in which a software engineer uses a UML class diagram (shown in Figure 1) to model an online shopping system. This example is used throughout the paper to illustrate our approach. In addition to the attributes and classes depicted in Figure 1, the engineer defines 5 operation calls to further constrain the model. These operation calls allow users to change their shopping cart, check out, and confirm and cancel their orders. For each operation call, the corresponding pre- or postcondition(s) are also defined. For example, the checkout operation requires a shopping cart to be non-empty before a user can submit an order.

In general, it is very easy to introduce a mistake or omit a condition when writing a specification. Unfortunately, the class diagram presented in Figure 1 also contains a mistake. The operational contracts allow a scenario in which users can still modify their orders after an order has been placed and confirmed. This mistake is quite difficult to identify by inspection. However, with the help of visualised call sequences it becomes obvious: it can be spotted in the synthesised call sequence shown in Figure 2. One way to fix this mistake is to add self.cart.confirmed = false as a precondition for both addItem and removeItem.

3 BACKGROUND

Before presenting our technique, we first review the background. Formally, synthesising a call sequence $G_{\langle k,P \rangle}$ from OCL operational contracts is defined in (Wu, 2019) as follows:

$$G_{\langle k,P \rangle} \overset{def}{=} \Phi \wedge \Psi$$

where k is the length of the sequence, P describes a set of properties, Φ describes the synthesis constraints and Ψ is a set of property constraints. Visually, a sequence $G_{\langle k,P \rangle}$ can be viewed as follows:

$$G_{\langle k,P \rangle} : S_0 \xrightarrow{\delta_1} S_1 \cdots \xrightarrow{\delta_k} S_k$$

where each $\delta_i$ in a sequence represents an operation call selected from the set of operation calls (denoted as ∆), and each $S_j$ is a (system) state that is triggered by a $\delta_i$ in the previous state $S_{j-1}$. $S_0$ is the initial state and $S_k$ is the final state that is derived by the kth operation call ($\delta_k$) in a sequence. In each state $S_j$, there is exactly one operation call $\delta_i$ invoked to make the transition to the next state $S_{j+1}$. Therefore, this transition captures all possible sequences that have a length of k.

(Regarding the running example: due to space restrictions, the operation cancel is not shown here.)

For example, we can define a synthesised call sequence $G_{\langle 6,P \rangle}$ (with a length of 6) for our online shopping model in Figure 1 with the following properties:

• addItem and removeItem cannot be called after confirmOrder.

• each operation must be called at least once.

• checkout and confirmOrder must be part of the sequence.

To check whether there exists a counter-example, we negate Ψ and discover a sequence (shown in Figure 2) in which a customer may add more items after confirming an order.
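As an illustrative sketch (it is not the paper's SMT encoding), the three properties listed above can be checked directly on one concrete call sequence in plain Python. The sequence is read off the matrix in Figure 5; the function names are hypothetical and introduced only for this example.

```python
# Operation calls of the online shopping model (names taken from the text).
OPERATIONS = ["addItem", "removeItem", "checkout", "confirmOrder", "cancel"]


def no_modification_after_confirm(seq):
    """addItem and removeItem are never called after confirmOrder."""
    if "confirmOrder" not in seq:
        return True
    first_confirm = seq.index("confirmOrder")
    return not any(op in ("addItem", "removeItem") for op in seq[first_confirm + 1:])


def every_operation_called(seq):
    """Each operation is called at least once."""
    return all(op in seq for op in OPERATIONS)


def contains_checkout_and_confirm(seq):
    """checkout and confirmOrder are both part of the sequence."""
    return "checkout" in seq and "confirmOrder" in seq


# Call sequence of length 6, read off the matrix in Figure 5.
sequence = ["addItem", "addItem", "checkout", "confirmOrder", "removeItem", "cancel"]

results = {
    "no_modification_after_confirm": no_modification_after_confirm(sequence),
    "every_operation_called": every_operation_called(sequence),
    "contains_checkout_and_confirm": contains_checkout_and_confirm(sequence),
}
# A sequence that breaks at least one property is a counter-example to the property set.
is_counter_example = not all(results.values())
```

In this sketch the sequence satisfies the second and third properties but breaks the first one, because removeItem appears after confirmOrder.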

4 BASIC SYNTHESIS

Our synthesis constraints (Φ) consist of a series of formulas $\phi_{\langle i,j \rangle}$, each of which encodes $\delta_i$ (the ith operation call), including its pre/postconditions, at state j.

Since a pre/postcondition may refer to a set of specific features in a class diagram, such as attributes and collections, our synthesis constraints also encode those features. The encoding of a feature is similar to the encoding described in (Wu, 2019). To be precise, we use a feature function to encode an attribute or a collection data type at a specific state. For example, $F_{id}(m, 2) = 1001$ means that the feature function $F_{id}$ sets the attribute id of an object m to 1001 at state 2.

Quantiﬁed Invariants. To encode a class invariant,

we use a quantiﬁed formula. This encoding allows us

to reduce the number of individual unrolled class in-

variants in each state to a single quantiﬁed formula.

For example, the class invariant in Figure 1 is repre-

sented using the following formula:

Note that $\delta_k$ is called in $S_{k-1}$ and triggers $S_k$. Thus, a call sequence of length k creates a total of k + 1 states.

Each object is encoded as a unique integer. Here, m represents an instance of MembershipCard.

MODELSWARD 2020 - 8th International Conference on Model-Driven Engineering and Software Development


Figure 1: A UML class diagram annotated with a set of OCL operational contracts models an online shopping system. inv

here speciﬁes a class invariant. pre and post specify preconditions and postconditions of an operation respectively. @pre here

denotes an object from previous state.

Figure 2: A call sequence shows a scenario that a customer may modify items after conﬁrming an order.

$$\forall c1 : Customer, c2 : Customer, s : INT.\ (c1 \neq c2) \Rightarrow \big(F_{card}(c1, s) \neq F_{card}(c2, s)\big)$$

where s is constrained as 0 ≤ s ≤ k. This formula captures the semantics that no two customers (c1 and c2) can have the same card number in any state (denoted by s). This encoding has the advantage of producing compact formulas no matter how large the sequence length (k) grows.
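To make the meaning of the quantified invariant concrete, the following minimal sketch spells out the explicit unrolling that the single quantified formula replaces. It assumes a feature function stored as a Python dictionary; the data and the function name are hypothetical.

```python
from itertools import combinations


def cards_unique_in_all_states(f_card, customers, k):
    """Check the invariant for every state s with 0 <= s <= k.

    f_card maps (customer, state) to a card number, mirroring F_card(c, s).
    """
    return all(
        f_card[(c1, s)] != f_card[(c2, s)]
        for s in range(k + 1)
        for c1, c2 in combinations(customers, 2)
    )


# Hypothetical data: two customers (encoded as integers 0 and 1) and k = 2.
customers = [0, 1]
k = 2
f_card = {(0, 0): 1001, (1, 0): 1002,
          (0, 1): 1001, (1, 1): 1002,
          (0, 2): 1001, (1, 2): 1002}
holds = cards_unique_in_all_states(f_card, customers, k)

# Giving both customers the same card in state 2 violates the invariant.
broken = dict(f_card)
broken[(1, 2)] = 1001
violated = not cards_unique_in_all_states(broken, customers, k)
```

The loop visits every pair of customers in each of the k + 1 states, which is the work the quantified form leaves to the solver instead of writing it out per state.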

Queries. Similarly, users can use quantifiers to specify queries over a set of states for specific checks without affecting the synthesis constraints. This significantly improves flexibility: users can debug and analyse a sequence without explicitly unrolling a query in each state. For example, a user can use the following query to check whether the number of items in a shopping cart (in Figure 1) is non-negative over a range of states from i to j (0 ≤ i ≤ j).

$$\forall s : INT.\ (i \le s \le j) \Rightarrow \big(Size(items, s) \ge 0\big)$$

where Size is defined as (Bag INT) × INT → INT and returns the number of elements in a collection at a particular state s. items encodes a collection of objects that are represented as a set of unique integers.

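The following diagram shows this query from state i to j. Note that the affected calls of this call sequence are $\delta_i, \delta_{i+1}, \ldots, \delta_j$.

$$\cdots \xrightarrow{\delta_i} \underbrace{S_i \xrightarrow{\delta_{i+1}} \cdots \xrightarrow{\delta_j} S_j}_{\text{query: } |items| \ge 0} \xrightarrow{\delta_{j+1}} \cdots \xrightarrow{\delta_k} S_k$$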

(Both c1 and c2 in the class invariant above have their own unique object ids.)

Synthesis Constraints. To formally construct the synthesis constraints, we simply conjoin each $\delta_i$'s pre/postconditions at state j with the class invariants. In general, our synthesis constraints are in first-order form, which allows us to use unbounded collections and quantified invariants. For example, the two postconditions defined for the 1st operation addItem() in Figure 1 at state j are encoded as follows:

$$\phi_{\langle 1,j \rangle} \overset{def}{=} \big(\exists x : INT.\ Select(items, x, j) = item0\big) \wedge \big(Size(items, j) = Size(items, j - 1) + 1\big)$$

We use an existential quantifier to encode the includes operation; x denotes a specific position in a collection, j refers to the current state and j − 1 to the previous state. Therefore, a transition can be established between two states via our synthesis constraints.
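The two addItem postconditions can be read as a simple check over two consecutive states. The sketch below assumes that the collection items is stored as one Python list per state and that objects are encoded as integers; it only evaluates the postconditions on given states and does not synthesise anything.

```python
def add_item_post(items_by_state, j, item0):
    """Postconditions of addItem at state j (a sketch of phi_<1,j>), for j >= 1.

    items_by_state[s] is the content of the collection `items` at state s.
    """
    current, previous = items_by_state[j], items_by_state[j - 1]
    # exists x. Select(items, x, j) = item0
    includes_item = any(current[x] == item0 for x in range(len(current)))
    # Size(items, j) = Size(items, j - 1) + 1
    size_grew_by_one = len(current) == len(previous) + 1
    return includes_item and size_grew_by_one


# Hypothetical run: item 7 is added at state 1 and nothing changes at state 2.
items_by_state = [[3], [3, 7], [3, 7]]
ok_at_1 = add_item_post(items_by_state, 1, 7)  # item present and size 1 -> 2
ok_at_2 = add_item_post(items_by_state, 2, 7)  # size unchanged, so this fails
```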

4.1 Transitions

Typically, a transition from state $S_{j-1}$ to $S_j$ is triggered by a single operation call encoded by $\phi_{\langle i,j-1 \rangle}$, assuming that no two calls can occur concurrently. To encode the transitions, we introduce a control variable $cv_{\langle i,j \rangle}$ for each $\phi_{\langle i,j \rangle}$. Each control variable controls the selection of its corresponding $\phi_{\langle i,j \rangle}$. More specifically, we constrain each control variable to be either 0 or 1. We say that $\phi_{\langle i,j \rangle}$ is enabled ($\delta_i$ is called at state j) only if $cv_{\langle i,j \rangle}$ is assigned 1. In this way, we can constrain the number of operation calls selected at state j. The following formula expresses the selection of exactly one operation call from ∆ at each state.


[Figure 3 (dependency graph): postcondition nodes post2, post4, post5 and post6; feature nodes items:size, shoppingcart:submitted and shoppingcart:confirmed.]

Figure 3: The dependency graph for postconditions defined in Figure 1.



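$$\Big(\bigwedge_{i=1}^{|\Delta|} \bigwedge_{j=0}^{k-1} \big((cv_{\langle i,j \rangle} = 1) \Rightarrow \phi_{\langle i,j \rangle}\big)\Big) \wedge \Big(\bigwedge_{j=0}^{k-1} \Big(\sum_{i=1}^{|\Delta|} cv_{\langle i,j \rangle}\Big) = 1\Big) \qquad (1)$$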

where $cv_{\langle i,j \rangle} \in \{0, 1\}$. On the other hand, if a control variable at state j is set to 0, then the corresponding operation call $\delta_i$ at that state is not selected. This implies that the attribute (affected by $\delta_i$) of an object o (specified in a postcondition) has not been changed, since $\delta_i$ has not been selected. To compute the set of unchanged attributes, we work out the postconditions that affect an object's attributes by constructing a dependency graph, as illustrated in Figure 3.
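The second conjunct of Formula 1 (exactly one operation call per state) can be illustrated with a small brute-force enumeration. This is only a sketch under the assumption that the control variables are stored as a list of 0/1 rows; the enumeration stands in for the SMT solver and is feasible only for tiny matrices.

```python
from itertools import product


def exactly_one_call_per_state(cv):
    """cv[i][j] is the 0/1 control variable of operation i at state j."""
    num_states = len(cv[0])
    return all(sum(row[j] for row in cv) == 1 for j in range(num_states))


def count_valid_matrices(num_ops, k):
    """Brute-force count of 0/1 matrices that satisfy the constraint."""
    count = 0
    for bits in product((0, 1), repeat=num_ops * k):
        cv = [list(bits[i * k:(i + 1) * k]) for i in range(num_ops)]
        if exactly_one_call_per_state(cv):
            count += 1
    return count


# Each of the k states picks one of num_ops operations: num_ops ** k choices.
valid = count_valid_matrices(num_ops=3, k=2)
```

Out of the 64 possible assignments for 3 operations and 2 states, only the 9 that pick a single operation in each state survive the constraint.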

Let D be the set of nodes representing postconditions in a dependency graph. Each d ∈ D is a specific postcondition of $\delta_i$. We let T be the set of nodes that each d depends on. We can now use Formula 2 to express the set of attributes that have not been changed:

$$\bigwedge_{j=1} \Big( \Big(\bigwedge_{d \in D} cv_{\langle [d],j \rangle} = 0\Big) \Rightarrow \Big(\bigwedge_{t \in T} F_{[t]}(o, j) = F_{[t]}(o, j - 1)\Big) \Big) \qquad (2)$$

where j ranges over the states after the initial state, F denotes a feature function and [ ] maps a node to the ith operation call or a specific feature. For example, we can construct a dependency graph (shown in Figure 3) for our online shopping model. Postconditions 2 and 4 (post2 and post4 in Figure 3) in Figure 1 both depend on the size of the object items (since they change the number of items). Hence, when neither postcondition 2 nor postcondition 4 is applied, the number of items is not affected.
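The frame rule can be sketched as a lookup in the dependency graph: any feature that no applied postcondition depends on must keep its previous value. In the sketch below only the post2 and post4 entries are stated in the text; the post5 and post6 entries, and the function name, are assumptions made for illustration.

```python
# Hypothetical dependency graph in the spirit of Figure 3:
# postcondition node -> feature it depends on. Only the post2/post4 entries
# are stated in the text; the other two entries are assumptions.
DEPENDS_ON = {
    "post2": "items:size",
    "post4": "items:size",
    "post5": "shoppingcart:submitted",
    "post6": "shoppingcart:confirmed",
}


def unchanged_features(applied_postconditions):
    """Features that no applied postcondition depends on, so the frame
    condition F(o, j) = F(o, j - 1) must hold for them."""
    touched = {DEPENDS_ON[p] for p in applied_postconditions}
    return set(DEPENDS_ON.values()) - touched


# Neither post2 nor post4 is applied: the number of items must stay the same.
frame = unchanged_features({"post5"})
```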

Though the rule here only works on explicit changes,

implicit ones can be further extracted using the technique

presented in (Niemann et al., 2015).

Note that the operations: includes and excludes only

check the content of a collection rather than modifying its

content.

5 PROPERTY-BASED SYNTHESIS

In this section, we introduce a technique that allows us to synthesise a call sequence with respect to a set of properties. This technique uses a boolean matrix M as an intermediate representation for separating the synthesis constraints (Φ) from the property constraints (Ψ). Each property constraint ($\psi_i$) is then applied to constrain this boolean matrix. This provides a degree of flexibility that enables editing or expanding properties over a call sequence without affecting our synthesis constraints. The matrix M is shown in Figure 4. Essentially, this matrix captures all available operation calls from the initial state ($S_0$) to state $S_{k-1}$. Each row of M represents one possible operation call, and each column of M represents a transitional state. Each entry $cv_{\langle i,j \rangle}$ in the matrix is a control variable that denotes an operation call $\delta_i$ at state j. For example, the call sequence in Figure 2 is represented by the matrix in Figure 5.

5.1 Called-before

In many scenarios, a specific order of a series of calls is critical for verifying the behaviours of a system. For example, the addItem operation call in Figure 1 must be called before confirmOrder. Many potential problems in a design can be identified when an incorrect order is presented. In this subsection, we introduce a called-before property.

Given two operation calls $\delta_i$ and $\delta_j$, the notation $\delta_i \rightarrow \delta_j$ defines a constraint that $\delta_i$ must be called before $\delta_j$. For example, addItem → confirmOrder defines that addItem must be invoked before confirmOrder. In order to encode this constraint, we look at it the opposite way. A called-before relation $\delta_i \rightarrow \delta_j$ implies that the operation call $\delta_j$ must not be called before the operation call $\delta_i$, that is $\delta_j \nrightarrow \delta_i$. Thus, our encoding works by blocking the possibility of $\delta_j$ being called before $\delta_i$. More precisely, consider a called-before sequence $G : \delta_1 \rightarrow \delta_2 \rightarrow \ldots \delta_x \ldots \rightarrow \delta_n$. Let x denote the xth δ in G. We start from the nth δ in G and, once it has been called, prevent each δ before the nth position from being called; we repeat this until the 1st δ is reached. This constraint applies from the initial state to the second last state, since the number of states (where an operation is being called) is bounded from 0 to k − 1.

Note that $\delta_1$ here denotes the 1st operation call in G. That does not mean it also denotes the 1st operation call in ∆.








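$$M = \begin{array}{cccc}
S_0 & S_1 & \ldots & S_{k-1} \\
cv_{\langle 1,0 \rangle} & cv_{\langle 1,1 \rangle} & \ldots & cv_{\langle 1,k-1 \rangle} \\
cv_{\langle 2,0 \rangle} & cv_{\langle 2,1 \rangle} & \ldots & cv_{\langle 2,k-1 \rangle} \\
cv_{\langle 3,0 \rangle} & cv_{\langle 3,1 \rangle} & \ldots & cv_{\langle 3,k-1 \rangle} \\
\vdots & \vdots & & \vdots \\
cv_{\langle n,0 \rangle} & cv_{\langle n,1 \rangle} & \ldots & cv_{\langle n,k-1 \rangle}
\end{array}$$

Figure 4: A boolean matrix M.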







                S0  S1  S2  S3  S4  S5
addItem          1   1   0   0   0   0
removeItem       0   0   0   0   1   0
checkout         0   0   1   0   0   0
confirmOrder     0   0   0   1   0   0
cancel           0   0   0   0   0   1

Figure 5: A matrix representing the call sequence in Figure 1 (excluding the last state) for a sequence that has a length of k.

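$$\bigwedge_{x=n}^{2} \bigwedge_{i=0}^{k-2} \Big( \big(cv_{\langle [\![x]\!], i \rangle} = 1\big) \Rightarrow \bigwedge_{y=x-1}^{1} \bigwedge_{j=i+1}^{k-1} cv_{\langle [\![y]\!], j \rangle} = 0 \Big) \qquad (3)$$

Formula 3 captures this constraint, where $[\![\ ]\!]$ maps the ith δ in G to the jth δ in ∆; x runs from n down to 2 and y from x − 1 down to 1. For example, Figure 6 shows an example of confirmOrder ↛ (addItem ∧ removeItem). That is, if confirmOrder (blue shaded area) is called at state $S_2$, then addItem and removeItem (red shaded area) cannot be called from state $S_3$ to $S_5$.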







                S0  S1  S2  S3  S4  S5
addItem          x   x   x   x   x   x
removeItem       x   x   x   x   x   x
checkout         x   x   x   x   x   x
confirmOrder     x   x   x   x   x   x
cancel           x   x   x   x   x   x

Figure 6: An example matrix showing that confirmOrder is called at state 2. This implies that no addItem or removeItem should be called in the subsequent states. Each x in the matrix represents a control variable. The control variables in the red shaded area are switched off, while the control variable in the blue shaded area is switched on.
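The blocking constraint of Formula 3 can also be checked on a concrete 0/1 matrix. The sketch below assumes the matrix is stored as a dictionary from operation name to its row of control variables; it evaluates the constraint on a given matrix rather than generating SMT formulas, and the function name is hypothetical.

```python
def satisfies_called_before(cv, chain):
    """Check the blocking constraint of Formula 3 on a 0/1 matrix.

    cv maps an operation name to its row of control variables (one per state);
    chain lists operation names in called-before order (1st ... nth).
    """
    k = len(next(iter(cv.values())))
    for x in range(len(chain) - 1, 0, -1):      # from the nth member down to the 2nd
        for i in range(k - 1):                  # states 0 .. k-2
            if cv[chain[x]][i] != 1:
                continue
            for y in range(x):                  # every earlier member of the chain
                if any(cv[chain[y]][j] == 1 for j in range(i + 1, k)):
                    return False                # an earlier member is called later
    return True


# Matrix from Figure 5 (rows: operations, columns: states 0..5).
cv = {
    "addItem":      [1, 1, 0, 0, 0, 0],
    "removeItem":   [0, 0, 0, 0, 1, 0],
    "checkout":     [0, 0, 1, 0, 0, 0],
    "confirmOrder": [0, 0, 0, 1, 0, 0],
    "cancel":       [0, 0, 0, 0, 0, 1],
}
add_then_confirm = satisfies_called_before(cv, ["addItem", "confirmOrder"])
remove_then_confirm = satisfies_called_before(cv, ["removeItem", "confirmOrder"])
```

On this matrix the chain addItem → confirmOrder holds, while removeItem → confirmOrder is violated because removeItem is selected at state 4, after confirmOrder at state 3.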

5.2 Called Exactly n Times

In general, we use Formula 4 to capture that an operation call occurs exactly n times within a sequence. Let $cv_{\langle c,j \rangle}$ denote a control variable that represents the cth operation call at state j. We count the control variables $cv_{\langle c,j \rangle}$ that are set to 1 from the initial state to state k − 1, and constrain this number to be n. Formula 4 also provides a general form for other numeric constraints, such as ensuring that an operation call occurs in a sequence at least n or no more than n times. This can be achieved by changing = in Formula 4 to ≥, ≤, > or <.

$$\sum_{j=0}^{k-1} cv_{\langle c,j \rangle} = n, \text{ where } c \text{ is a constant and } n \le k - 1. \qquad (4)$$

5.3 Full Reachability

We can easily extend the called exactly n times property to check whether every single operation call $\delta_i$ defined in ∆ can be reached at least once within a sequence. We say such a sequence has full reachability. This is achieved by using Formula 5. This property is quite useful because it allows users to synthesise a call sequence that covers each operation call at least once. It therefore requires a call sequence to have a minimum length of |∆|.

$$\bigwedge_{i=1}^{|\Delta|} \bigvee_{j=0}^{k-1} cv_{\langle i,j \rangle} = 1, \text{ where } k \ge |\Delta|. \qquad (5)$$

For Formula 5, we can also fix j to check specifically whether an operation call can be reached at a specific position in a sequence. For example, removeItem cannot be placed at the beginning of a sequence if the shopping cart in the initial state is empty.
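Formulas 4 and 5 both reduce to counting entries in rows of the control matrix. The following sketch evaluates them on the matrix from Figure 5, stored here as a dictionary of 0/1 rows; the helper names are hypothetical.

```python
def called_exactly(cv_row, n):
    """Formula 4 for one operation: its control variables sum to n."""
    return sum(cv_row) == n


def full_reachability(cv):
    """Formula 5: every operation is selected in at least one state."""
    return all(any(v == 1 for v in row) for row in cv.values())


# Matrix from Figure 5 (rows: operations, columns: states 0..5).
cv = {
    "addItem":      [1, 1, 0, 0, 0, 0],
    "removeItem":   [0, 0, 0, 0, 1, 0],
    "checkout":     [0, 0, 1, 0, 0, 0],
    "confirmOrder": [0, 0, 0, 1, 0, 0],
    "cancel":       [0, 0, 0, 0, 0, 1],
}
add_item_twice = called_exactly(cv["addItem"], 2)
reachable = full_reachability(cv)
```

Replacing the equality in called_exactly with another comparison gives the "at least n" and "no more than n" variants mentioned above.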

5.4 Partial Sequence

In many situations, users may want to analyse a certain fixed portion of a call sequence. We say such a sequence is partially known, or a partial sequence. A partial sequence is a subsequence $L = \langle \delta_{1+i}, \ldots, \delta_{m+i} \rangle$ of a sequence $S = \langle \delta_1, \delta_2, \ldots, \delta_k \rangle$ such that i ≥ 0 and m + i ≤ k. This property requires that every synthesised call sequence must contain that partial sequence. Thus, it allows users to analyse the behaviours of a system under a particular portion of a call sequence. For example, Figure 2 shows a synthesised call sequence that must contain the partial sequence checkout and confirmOrder.

Formula 6 expresses our partial sequence property. Let L denote a partial sequence with |L| ≥ 1 that consists of a series of operation calls $\langle \delta_1, \delta_2, \ldots, \delta_{|L|} \rangle$. For each $\delta_a \in L$ at state i, encoded by a control variable $cv_{\langle a,i \rangle}$, we ask the SMT solver to select this control variable, starting from the initial state (i = 0) until the partial sequence can no longer fit in the remaining states (successive calls of L occupy successive states, as Figure 7 illustrates). This encodes all possible ways in which a partial sequence L could occur within a sequence. We then make sure that at least one of them is chosen by the solver.

$$\bigvee_{i=0}^{m-1} \bigwedge_{\delta_a \in L} cv_{\langle a,i \rangle} = 1, \text{ where } m = k - |L| + 1. \qquad (6)$$

For example, we let L = ⟨checkout, confirmOrder⟩ be a partial sequence and generate a call sequence that must contain L. Each blue shaded area in Figure 7 represents a way of selecting this partial sequence by enabling the corresponding control variables. In addition, Figure 7 shows that there is a total of 5 ways of synthesising this partial sequence L. Selecting any one of the five ways satisfies the partial sequence property. Therefore, the disjunction in Formula 6 makes sure that at least one of the possible subsequences is selected.

Figure 7: There are 5 possible ways of synthesising a call sequence (of length 6) that contains a partial sequence consisting of checkout and confirmOrder for the running example in Figure 1. Here, each x represents a control variable.
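The count of 5 placements follows from m = k − |L| + 1 with k = 6 and |L| = 2. A minimal sketch, assuming the partial sequence must occupy consecutive states as in Figure 7 (function names are hypothetical):

```python
def window_starts(k, partial_len):
    """Start states at which a partial sequence of the given length still fits."""
    return list(range(k - partial_len + 1))


def contains_partial(seq, partial):
    """True if `partial` occurs as a contiguous run somewhere in `seq`."""
    n = len(partial)
    return any(seq[i:i + n] == partial for i in window_starts(len(seq), n))


partial = ["checkout", "confirmOrder"]
starts = window_starts(6, len(partial))  # m = 6 - 2 + 1 = 5 start states
sequence = ["addItem", "addItem", "checkout", "confirmOrder", "removeItem", "cancel"]
found = contains_partial(sequence, partial)
```

Each start state corresponds to one disjunct of Formula 6; the check succeeds as soon as one of them matches.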

6 FORMULA SIMPLIFICATIONS

Removing Invariants. We notice that in many cases a class invariant may not affect the features used by the postconditions of an operation. In other words, a class invariant may not depend on the features that a postcondition depends on. In this case, we can safely remove this class invariant from the model. The general rule is to construct two dependency graphs, one for postconditions and one for invariants. If a feature f that appears in the invariant graph does not appear in the postcondition graph, then we can remove all the class invariants that depend on this f. Otherwise, we reduce each class invariant to a single quantified formula.

For example, the class invariant in Figure 1 depends on the attribute id in the membership class. However, id does not appear in the dependency graph in Figure 3. Therefore, we can remove this class invariant from the synthesis constraints.

Removed invariants can be reasoned about using a separate procedure.

Implicit Invariants. The set of pre/postconditions may impose constraints on a set of common features in a model. Unrolling each of them produces similarly structured formulas and therefore increases the overall formula size. For example, each operation call in Figure 1 implicitly uses one common constraint: the number of items contained in a shopping cart must not be negative. In fact, this constraint is an implicit class invariant. When users discover such a constraint through the analysis of a synthesised sequence, they may elevate it to a class invariant and introduce a ∀ quantifier. Hence, we introduce the following simplification rule:

$$\bigwedge_{j=0}^{k} F(o, j) \equiv \forall o : T, s : INT.\ F(o, s) \qquad (7)$$

Formula 7 implies that if there exists a constraint (F) on an object (o) over bounded states, then this constraint can be reduced to a single quantified one.

Symmetric Relations. Relations between two classes are modelled as associations in a UML class diagram. However, many relations are symmetric, and naively unrolling formulas over these relations doubles the overall formula size. For example, a marriage relationship between two people is symmetric, since marriage(person0, person1) is the same as marriage(person1, person0). Thus, we can halve the formula size by unrolling only one of them. We now provide a simplification rule (Formula 8) for expressing such symmetric relations over unbounded states.

$$\Big(\bigwedge_{j=0}^{k} Rel(o_1, o_2, j)\Big) \wedge \Big(\bigwedge_{j=0}^{k} Rel(o_2, o_1, j)\Big) \equiv \forall o_1 : T_1, o_2 : T_2, s : INT.\ Rel(o_1, o_2, s) = Rel(o_2, o_1, s) \qquad (8)$$

Formula 8 uses ∀ to quantify a (binary) symmetric relation Rel. With this rule, we can unroll only one of the Rel(o1, o2, j) and therefore reduce the overall formula size.

An n-ary relation can be decomposed into multiple binary relations.


7 IMPLEMENTATION AND

EVALUATION

We have implemented a prototype that generates synthesis and property constraints for a given input

model. Currently, this prototype is semi-automatic.

We ﬁrst generate a smaller template for a given model.

The template includes information such as the num-

ber of the objects and the length of a sequence to be

synthesised. Then, our tool uses a formula reason-

ing engine from MaxUSE for instantiating those tem-

plates to produce the concrete synthesis and property

constraints (Wu, 2017a; Wu, 2017b). By default, our

tool generates SMT2 standard formulas and uses the

Z3 SMT solver for constraint solving (De Moura and

Bjørner, 2008).

In order to evaluate the scalability of our tech-

nique, we collect 4 models from recent literature and

one from the example shown in Figure 1 to form a

benchmark. Since our prototype is semi-automatic,

we ﬁrst generate a smaller template for each model in

the benchmark and manually verify the correctness of

these formulas. We then scale them into much longer

sequences.

7.1 Evaluation

We evaluate our technique on an Intel(R) Xeon(R)

machine with eight 3.2GHz cores and 16 GB memory.

However, our evaluation only uses one single core.

The evaluation is divided into two phases. First, we

evaluate the synthesis constraints with property con-

straints with respect to pre/post conditions. Second,

we evaluate the effectiveness of our simpliﬁcation

rules by applying appropriate rules on each model.

We now discuss our evaluation results. All of our generated results can be found at: http://www.cs.nuim.ie/~haowu/synseq.html

Results. The non-gray rows in Table 1 show the results of our evaluation on synthesis and property constraints. The ‘Time’ and ‘Size’ columns show the time spent (in seconds) on synthesising sequences and the number of formulas generated, respectively. In general, our technique is able to synthesise large sequences quite efficiently. For each model, our technique can handle a number of objects and pre/postconditions. The most challenging model in this benchmark is the Bank model. This is because this model contains many numeric constraints, which pose a great challenge to the SMT solver. To determine how effective our simplification rules are, we apply suitable simplification rules to each model. For example, we use the full reachability (FR) property constraint and the removing invariants (RI) simplification rule to discover a counter-example (for the TrafficLights model) in which a pedestrian light and a car light are both in a dead state (neither can progress with a further operation when k ≥ 4). The gray rows in Table 1 show the difference after applying suitable simplification rules. The results show that the simplification rules in Section 6 effectively reduce the overall formula size and increase performance in most cases. Interestingly, though the symmetric relation rule can halve the formula size, it does not always increase performance. This phenomenon is observed when we set k ≥ 250 for the Marriage model. We surmise that the symmetric relation (SR) rule introduces additional quantifiers, and this may cause the solver to spend a considerable amount of time instantiating those quantifiers.

Comparison. Compared with existing approaches (Soeken et al., 2011b; Przigoda et al., 2015), our technique can synthesise much larger sequences. Here, we compare our technique against bit-vector based approaches (Soeken et al., 2011a; Soeken et al., 2011b; Przigoda et al., 2015). In general, it is difficult to compare performance, because: (1) performance typically depends on the specific models, properties (deadlock, reachability) and the individual SMT solver used; (2) it is extremely difficult to reimplement others' formulas due to the lack of publicly accessible data (Soeken et al., 2011a; Soeken et al., 2011b; Przigoda et al., 2015). In order to perform a fair performance comparison, we compare our approach against bit-vector based approaches on the benchmark, but select the bank model as a representative model here. In general, our comparison results reveal that the bit-vector based approaches perform well only when the size of the bit-vectors is relatively small.

To conduct a meaningful experiment, we reimplement the bit-vector based approaches presented in (Soeken et al., 2011b; Przigoda et al., 2015) for the bank model (Przigoda et al., 2015) and set up three groups. Each group uses a different bit-vector size: 8, 16 and 32. Hence, each account in the three groups has an upper bound of 255, 65535 and 4294967295, respectively. For each group, we then instantiate 3 bank accounts and constrain the initial and final states within a sequence. We ask the SMT solvers to synthesise a sequence with a length (k) from 10 up to 50 (with an interval of 10). We use Boolector for the bit-vector based approaches. Boolector is a specially crafted SMT solver for bit-level reasoning (Niemetz et al., 2015). We then use the Z3 SMT solver to solve the synthesis constraints generated using our technique.

Due to the page limit, we use the bank model as our representative model to explain the comparison results.

Though the work in (Przigoda et al., 2015) checks the concurrent behaviours of a system, the fundamental encoding is the same as the work in (Soeken et al., 2011b).

Table 1: The evaluation results of synthesis and property constraints with simplification rules for models collected from the literature: Company (Gogolla et al., 2007), TrafficLights (Soeken et al., 2011b), Bank (Przigoda et al., 2015) and Marriage (Gogolla et al., 2017). The time unit is seconds. The gray rows show the improvements in both performance and formula size after applying simplification rules. CB, FR, CEnT and PS denote the called-before, full reachability, called exactly n times and partial sequence properties, respectively. RI, II and SR denote the removing invariants, implicit invariants and symmetric relations simplification rules, respectively. In each pair of rows below, the second row is the gray one.

                  Company         TrafficLights   Bank            Marriage        Onlineshop
                  FR, RI          FR, RI          CEnT, RI        CEnT, SR        PS, CB, RI, II
Length   Row      Time    Size    Time    Size    Time    Size    Time    Size    Time    Size
k = 100  plain    0.47    2302    0.91     912    10.83   2499     4.48    7549   0.09    1314
         gray     0.24    2201    0.42     810     8.40   1599     2.29    4305   0.09    1013
k = 150  plain    3.56    3450    4.63    1362    18.12   3748    10.30   11575   1.24    1963
         gray     3.50    3302    3.46    1210    14.72   2398     7.51    5994   1.12    1509
k = 200  plain    5.50    4603   14.72    1810    80.56   5203    18.39   15288   4.56    2611
         gray     4.48    4402    7.22    1597    45.72   3401     8.51    8598   3.22    2247
k = 250  plain    6.57    5752   21.87    2262   154.58   6428    19.53   21840   5.45    3011
         gray     5.86    5500   15.61    2010   151.61   3998    20.21    9994   5.70    2509
k = 300  plain    7.71    6901   32.06    2704   167.93   7498    25.83   21849   6.30    3613
         gray     7.38    6704   23.70    2408   109.74   4798    26.39   11994   6.21    3008

Table 2 shows the performance comparison results on the bank model. The bit-vector based encoding works quite well with a width of 8. However, as the width of each bit-vector increases, performance decreases significantly. This is because the solver must create a Boolean variable for each bit in a bit-vector (bit-blasting) before it performs a series of arithmetic operations on each bit. This is fine for a bit-vector with a small fixed width. In Table 2, the bit-vector based encoding (with 16 and 32 bit widths) can only synthesise a call sequence of length up to 20 before timing out. On the other hand, our technique scales much better. This is also more realistic, since many real-world data structures work with 32-bit integers. For example, a bank may represent the amount of money that an account can hold using a 32-bit (or even larger) integer.
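To make the cost of bit-blasting more concrete, the following minimal Python sketch adds two integers the way a bit-blasted adder would: each operand becomes a list of Boolean values and the sum is computed one full adder at a time. The function names and the count of five gates per full adder are illustrative assumptions and do not describe how Boolector works internally; the sketch only shows that the number of Boolean variables and gates grows with the bit width.

```python
def to_bits(value, width):
    """Return the little-endian Boolean representation of a non-negative integer."""
    return [bool((value >> i) & 1) for i in range(width)]


def from_bits(bits):
    """Convert a little-endian list of Booleans back to an integer."""
    return sum(1 << i for i, bit in enumerate(bits) if bit)


def blasted_add(a, b, width):
    """Add a and b modulo 2**width using one full adder per bit.

    Returns the sum, the number of Boolean input variables and the
    number of gates used (assumed 5 per full adder: 2 XOR, 2 AND, 1 OR).
    """
    a_bits, b_bits = to_bits(a, width), to_bits(b, width)
    carry, out, gates = False, [], 0
    for x, y in zip(a_bits, b_bits):
        partial = x != y                          # XOR of the two operand bits
        out.append(partial != carry)              # XOR with the incoming carry
        carry = (x and y) or (partial and carry)  # AND, AND, OR
        gates += 5
    return from_bits(out), 2 * width, gates


if __name__ == "__main__":
    for width in (8, 16, 32):
        total, variables, gates = blasted_add(100, 55, width)
        print(f"width={width}: sum={total}, variables={variables}, gates={gates}")
```

Even for a single addition, moving from 8 to 32 bits quadruples the Boolean variables and gates in this sketch; a call sequence repeats such operations for every attribute in every state.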

7.2 Discussion

Compared to existing approaches, our evaluation results show that our synthesis and property constraints can be solved efficiently at a much larger scale (Soeken et al., 2011b; Przigoda et al., 2016; Przigoda et al., 2015). Our simplification rules also effectively reduce the overall formula size. In comparison to bit-vector based approaches, the trade-off here is among three aspects: expressiveness, flexibility and performance. Therefore, we compare our technique to bit-vector based approaches in these three aspects. Table 3 gives a detailed comparison across several criteria. In general, encoding a model along with its OCL operational contracts into quantifier-free bit-vectors should yield high performance during SMT solving. However, this is not always the case. In particular, when the model must be encoded into wider bit-vectors or involves complicated arithmetic computations, the performance decreases significantly due to bit-blasting. Our technique balances the three aspects better by using a first-order encoding. This allows quantifiers to be used for class invariants, queries and triggered system states. This helps to reduce the number of formulas unrolled in each state and provides much more expressiveness and flexibility while maintaining high performance.
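The difference between unrolling an invariant in each state and quantifying over states can be sketched as follows. This is a minimal Python illustration that emits SMT-LIB text; the attribute name balance, the use of an integer as a state index and the invariant itself are hypothetical choices made for the example, not the encoding used in this paper.

```python
def unrolled_invariant(k):
    """Quantifier-free form: one constant and one assertion per state."""
    lines = [f"(declare-const balance_{i} Int)" for i in range(k)]
    lines += [f"(assert (>= balance_{i} 0))" for i in range(k)]
    return lines


def quantified_invariant(k):
    """First-order form: one state-indexed function and one quantified assertion."""
    return [
        "(declare-fun balance (Int) Int)",
        f"(assert (forall ((s Int)) (=> (and (<= 0 s) (< s {k})) (>= (balance s) 0))))",
    ]


if __name__ == "__main__":
    for k in (100, 300):
        print(f"k={k}: unrolled={len(unrolled_invariant(k))} lines, "
              f"quantified={len(quantified_invariant(k))} lines")
```

In this sketch the unrolled form grows linearly with the sequence length, whereas the quantified form stays at two lines and only its bound changes.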

Automation. Though our technique is semi-automatic, we provide a set of Python scripts that process a formula template generated by MaxUSE and instantiate it with concrete synthesis and property constraints. The amount of user intervention required is small. However, adding extra constraints for a particular state may require users to manually insert the formula into a template. We are now building a tool that automatically inserts the formula at the appropriate place in a template.
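As an illustration of what instantiating a formula template can look like, the sketch below fills two placeholders with lists of assertions using Python's standard string.Template. The template layout, the placeholder names $synthesis and $property, and the example assertions are all hypothetical; the actual template format generated by MaxUSE is not described here.

```python
from string import Template

# Hypothetical template; the real format produced by MaxUSE may differ.
FORMULA_TEMPLATE = Template(
    "; model declarations\n"
    "(declare-fun balance (Int) Int)\n"
    "; synthesis constraints\n"
    "$synthesis\n"
    "; property constraints\n"
    "$property\n"
    "(check-sat)\n"
)


def instantiate(template, synthesis, property_constraints):
    """Fill the two placeholders with newline-joined assertions."""
    return template.substitute(
        synthesis="\n".join(synthesis),
        property="\n".join(property_constraints),
    )


if __name__ == "__main__":
    script = instantiate(
        FORMULA_TEMPLATE,
        synthesis=["(assert (= (balance 0) 0))"],
        property_constraints=["(assert (> (balance 1) 0))"],
    )
    print(script)
```

Keeping synthesis and property constraints in separate placeholders mirrors the separation listed in Table 3, so a property can be swapped without touching the synthesis part.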

OCL. Our technique currently supports a range of

OCL language constructs. These include: constraints

on attributes, different operations over collection data

MODELSWARD 2020 - 8th International Conference on Model-Driven Engineering and Software Development


Table 2: The performance comparison between bit-vector based approaches and our technique. The time unit is seconds and TO indicates a timeout. BV(8), BV(16) and BV(32) indicate that each bit-vector has a width of 8, 16 and 32 bits respectively. The timeout is set to 180 seconds for both the Boolector and Z3 SMT solvers.

Approach        k = 10   k = 20   k = 30   k = 40   k = 50
BV(8)             0.41     1.22     3.25     4.67     6.32
BV(16)            0.54     10.7       TO       TO       TO
BV(32)            0.93     37.6       TO       TO       TO
Our Technique     0.13     0.54     0.62     3.58     5.67

Table 3: The criteria for comparing our technique (first-order encoding) against bit-vector based approaches. E, F and P denote the three aspects: expressiveness, flexibility and performance.

Criterion              Bit-vector                            First-order
Data Types (E)         Bounded                               Unbounded
Class Invariants (E)   Unrolled in each state                Quantified
Query (F)              No direct support                     Quantified states
Property (F)           Unrolled with synthesis constraints   Separated from synthesis constraints
Formula Size (P)       Compact                               Reduced
Solving Time (P)       Fast on small size bit-vectors        Fast first-order reasoning
Underlying Solver (P)  BV solver                             SMT solver

types and quantiﬁed class invariants. However, the

models collected in the benchmark (Table 1) do not

cover all OCL features. Hence, our technique does not yet support the full OCL language. Further, OCL is a 4-valued logic language

that includes undeﬁned and invalid values. Currently,

we do not support encodings for undeﬁned and invalid

values. However, we are investigating a new encoding

that allows us to support multi-valued logic.

Our Findings. We have two main findings: (1) Our synthesis and property constraints possess high expressiveness and flexibility. This significantly improves the possibilities for debugging or analysing different sequences using quantified first-order forms, without regenerating synthesis constraints for each state. (2) Our simplification rules are effective for reducing formula size and boosting performance. They also give users a general template for extending the rules or creating more customised ones.
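The effect of the simplification rules on formula size can be quantified directly from Table 1. The short Python sketch below computes the relative size reduction at k = 300, assuming that the first row of each pair in Table 1 is the result before simplification and the second row the result after it; the dictionary and function names are introduced here only for illustration.

```python
# Formula sizes at k = 300 taken from Table 1: (before rules, after rules).
SIZES_K300 = {
    "Company": (6901, 6704),
    "TrafficLights": (2704, 2408),
    "Bank": (7498, 4798),
    "Marriage": (21849, 11994),
    "Onlineshop": (3613, 3008),
}


def reduction_percent(before, after):
    """Relative reduction in formula size, in percent."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    for model, (before, after) in SIZES_K300.items():
        print(f"{model}: {reduction_percent(before, after):.1f}% smaller")
```

Under this reading, the reduction at k = 300 ranges from about 3% for Company to about 36% for Bank and 45% for Marriage, so the benefit depends strongly on the model and on the rule applied.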

Limitations. We identify two limitations. (1) Our property constraints currently work on bounded states. Hence, when the length of a sequence increases, regenerating the property constraints is inevitable. Though it is possible to introduce quantified forms for the property constraints, rewriting the property constraints unrolled in each state into first-order form may require a more sophisticated encoding for the synthesis constraints. (2) Our simplification rules in general reduce the overall formula size. However, the introduced quantifiers may lead to quantifier alternations (if the formula is already quantified). This typically poses a great challenge to current SMT solvers. One way to tackle this is to use a specialised decision procedure designed for synthesis problems only (Reynolds et al., 2017).
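The first limitation can be illustrated with a small Python sketch that unrolls one plausible reading of a called-before property over k states and emits SMT-LIB text. The encoding is hypothetical: it assumes a function op that maps a state index to the identifier of the operation called in that state, and it is not the property constraint defined in this paper.

```python
def called_before_unrolled(first, second, k):
    """Unroll 'first is called before second' over states 0..k-1.

    Illustrative encoding only: (op i) is assumed to be the identifier
    of the operation called in state i.
    """
    assertions = []
    for i in range(k):
        earlier = [f"(= (op {j}) {first})" for j in range(i)]
        if not earlier:
            witness = "false"  # no earlier state exists for state 0
        elif len(earlier) == 1:
            witness = earlier[0]
        else:
            witness = "(or " + " ".join(earlier) + ")"
        assertions.append(f"(assert (=> (= (op {i}) {second}) {witness}))")
    return assertions


if __name__ == "__main__":
    for line in called_before_unrolled(first=1, second=2, k=3):
        print(line)
```

Because one assertion is produced per state and each refers to all earlier states, a longer sequence requires generating further assertions, which is the regeneration cost described in limitation (1).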

Threats to Validity. There are two threats to validity in our evaluation. (1) Our benchmark is formed from existing approaches and covers a wide range of OCL constructs, including navigation, nested quantified invariants and queries or operations over collection data types. However, it does not cover all OCL constructs, such as the closure operator. (2) The comparison results may not be precise enough. This is because we reimplemented the bit-vector based approaches based on our interpretation of the published articles rather than the actual concrete formulas, due to the lack of publicly accessible data.

8 RELATED WORK

SMT solving techniques (Przigoda et al., 2015;

Soeken et al., 2011b) and ﬁlmstripping (Gogolla

et al., 2014; Hilken and Gogolla, 2016; Gogolla et al.,

2017) are two major approaches for generating (syn-

thesising) call sequences from OCL operational con-

tracts. In general, the two approaches complement

each other and a detailed comparison has been con-

ducted in (Hilken et al., 2014). The ﬁlmstripping ap-


proach translates the source model into so-called ﬁlm-

strip models. These models essentially are UML class

diagrams annotated with OCL constraints. These con-

straints are frame conditions that specify the changes

to the model. One of the major advantages of using filmstripping is that the low level (SAT solver) is not explicitly exposed to users. Thus, it does not require users to have knowledge at the solver level. However, the cost here is the substantial manual interaction at the model level. Therefore, the filmstripping approach is suitable for very dedicated tasks. In comparison,

our technique presented here is more general and it

focuses on increasing expressiveness, ﬂexibility and

maintaining high performance through a novel ﬁrst-

order encoding.

Constraint programming (CP) is another popular approach to verifying dynamic aspects of UML models (Cabot et al., 2009). Typically, CP provides a high-level programming language so that a particular problem can be programmed into a constraint satisfaction problem (CSP) (Cabot et al., 2009; González Pérez et al., 2012; Cabot et al., 2014). In (Cabot et al., 2009), the verification tasks are programmed into a CSP that is later solved using constraint solvers. That work proposes a range of properties to be checked, such as weak and strong satisfaction, and mainly focuses on generating proofs rather than synthesising call sequences. Our work differs by proposing a first-order reasoning technique so that we can synthesise call sequences at a large scale via SMT solving.

Alloy, a model finding tool (Jackson, 2002; Torlak and Jackson, 2007), is widely used in many domains including verifying UML models. However, the majority of the work that uses Alloy as its basis focuses on verifying/solving structural constraints of a system (Anastasakis et al., 2007; Shah et al., 2009; Garis et al., 2011; Kuhlmann and Gogolla, 2012). For example, Anastasakis et al. map a range of OCL constructs to Alloy's specification (Anastasakis et al., 2007), and Kuhlmann et al. integrate Kodkod (Alloy's solving engine) into the USE modelling tool, which enables them to verify and analyse UML models annotated with different types of OCL constraints.

Other approaches have also sought to translate UML and OCL into different types of formalisms. These include interactive theorem provers such as Isabelle and KeY (Ahrendt et al., 2007; Brucker and Wolff, 2009; Balaban and Maraee, 2013; Dania and Clavel, 2013; Dania and Clavel, 2016). For example, Brucker et al. translate OCL into higher-order logic and carry out proofs using Isabelle (Brucker and Wolff, 2008). Others formalise UML models in PVS (Kyas et al., 2005). Compared to SAT/SMT solving (Wu et al., 2013; Wu, 2016), interactive approaches allow users to input/define their own theorems or axioms to guide the solver. This could be particularly helpful if the solver is stuck and cannot make progress during a proof.

9 CONCLUSION

In this paper, we present a new SMT-based technique that allows us to synthesise call sequences at a much larger scale than previously possible. This technique uses synthesis constraints in first-order form. To enable property-based synthesis with reduced formula size, we have designed a set of property constraints and simplification rules. Our evaluation results show that, in comparison to existing approaches, our technique provides a much more general solution for synthesising call sequences in terms of scalability, expressiveness, flexibility and performance. Currently, we are investigating new algorithms and techniques so that synthesis and property constraints with quantifier alternations can be reasoned about efficiently.

REFERENCES

Ahrendt, W., Beckert, B., Hähnle, R., and Schmitt, P. H. (2007). KeY: A formal method for object-oriented systems. In Formal Methods for Open Object-Based Distributed Systems, pages 32–43. Springer Berlin Heidelberg.

Anastasakis, K., Bordbar, B., Georg, G., and Ray, I. (2007).

UML2Alloy: A challenging model transformation. In

ACM/IEEE 10th International Conference on Model

Driven Engineering Languages and Systems, pages

436–450, Nashville, TN. Springer.

Balaban, M. and Maraee, A. (2013). Finite Satisfiability of UML Class Diagrams with Constrained Class Hierarchy. ACM Transactions on Software Engineering and Methodology, 22(3):24:1–24:42.

Berardi, D., Calvanese, D., and De Giacomo, G. (2005).

Reasoning on UML class diagrams. Artiﬁcial Intelli-

gence, 168(1-2):70–118.

Brucker, A. D. and Wolff, B. (2008). HOL-OCL: A for-

mal proof environment for UML/OCL. In 11th Inter-

national Conference on Fundamental Approaches to

Software Engineering, pages 97–100. Springer.

Brucker, A. D. and Wolff, B. (2009). Semantics, calculi,

and analysis for object-oriented speciﬁcations. Acta

Informatica, 46(4):255–284.

Büttner, F., Egea, M., and Cabot, J. (2012). On verifying ATL transformations using ‘off-the-shelf’ SMT solvers. In 15th International Conference on Model Driven Engineering Languages and Systems, pages 432–448.


Cabot, J., Clarisó, R., and Riera, D. (2009). Verifying UML/OCL operation contracts. In 7th International Conference on Integrated Formal Methods, pages 40–55, Düsseldorf, Germany. Springer.

Cabot, J., Clarisó, R., and Riera, D. (2014). On the verification of UML/OCL class diagrams using constraint programming. Journal of Systems and Software, 93:1–23.

Dania, C. and Clavel, M. (2013). OCL2FOL+: Coping with undefinedness. In OCL@MoDELS.

Dania, C. and Clavel, M. (2016). OCL2MSFOL: A mapping to many-sorted first-order logic for efficiently checking the satisfiability of OCL constraints. In 19th International Conference on Model Driven Engineering Languages and Systems, pages 65–75. ACM.

De Moura, L. and Bjørner, N. (2008). Z3: an efﬁcient SMT

solver. In 14th International Conference on Tools and

Algorithms for the Construction and Analysis of Sys-

tems, pages 337–340, Budapest, Hungary. Springer.

Garis, A., Cunha, A., and Riesco, D. (2011). Translating Al-

loy Speciﬁcations to UML Class Diagrams Annotated

with OCL. In 9th International Conference on Soft-

ware Engineering and Formal Methods, pages 221–

236, Montevideo, Uruguay. Springer.

Gogolla, M., Büttner, F., and Richters, M. (2007). USE: A UML-based specification environment for validating UML and OCL. Science of Computer Programming, 69(1-3):27–34.

Gogolla, M., Hamann, L., Hilken, F., Kuhlmann, M., and

France, R. B. (2014). From application models to

ﬁlmstrip models: An approach to automatic validation

of model dynamics. In Modellierung.

Gogolla, M., Hilken, F., Doan, K., and Desai, N. (2017).

Checking UML and OCL model behavior with ﬁlm-

stripping and classifying terms. In 11th International

Conference on Tests & Proofs, pages 119–128.

González Pérez, C. A., Buettner, F., Clarisó, R., and Cabot, J. (2012). EMFtoCSP: A tool for the lightweight verification of EMF models. In Formal Methods in Software Engineering: Rigorous and Agile Approaches, Zurich, Switzerland.

Hilken, F. and Gogolla, M. (2016). Verifying linear tempo-

ral logic properties in UML/OCL class diagrams using

ﬁlmstripping. In 2016 Euromicro Conference on Dig-

ital System Design, pages 708–713.

Hilken, F., Niemann, P., Gogolla, M., and Wille, R. (2014). Filmstripping and unrolling: A comparison of verification approaches for UML and OCL behavioral models. In Tests and Proofs, pages 99–116. Springer International Publishing.

Jackson, D. (2002). Alloy: a lightweight object modelling notation. ACM Transactions on Software Engineering and Methodology, 11(2):256–290.

Kuhlmann, M. and Gogolla, M. (2012). From UML and OCL to relational logic and back. In 15th International Conference on Model Driven Engineering Languages and Systems, pages 415–431. Springer.

Kyas, M., Fecher, H., de Boer, F. S., Jacob, J., Hooman,

J., van der Zwaag, M., Arons, T., and Kugler, H.

(2005). Formalizing UML models and OCL con-

straints in PVS. Electronic Notes in Theoretical Com-

puter Science, 115:39–47.

Niemann, P., Hilken, F., Gogolla, M., and Wille, R. (2015).

Extracting frame conditions from operation contracts.

In 18th International Conference on Model Driven

Engineering Languages and Systems, pages 266–275.

Niemetz, A., Preiner, M., and Biere, A. (2015). Boolec-

tor 2.0 system description. Journal on Satisﬁability,

Boolean Modeling and Computation, 9:53–58.

Przigoda, N., Hilken, C., Wille, R., Peleska, J., and Drechsler, R. (2015). Checking concurrent behavior in UML/OCL models. In 18th International Conference on Model Driven Engineering Languages and Systems (MODELS), pages 176–185.

Przigoda, N., Soeken, M., Wille, R., and Drechsler, R. (2016). Verifying the structure and behavior in UML/OCL models using satisfiability solvers. IET Cyber-Physical Systems: Theory & Applications, 1(1):49–59.

Reynolds, A., Kuncak, V., Tinelli, C., Barrett, C., and Deters, M. (2017). Refutation-based synthesis in SMT. Formal Methods in System Design.

Shah, S. M. A., Anastasakis, K., and Bordbar, B. (2009). From UML to Alloy and back again. In 6th International Workshop on Model-Driven Engineering, Verification and Validation, pages 4:1–4:10. ACM.

Soeken, M., Wille, R., and Drechsler, R. (2011a). En-

coding OCL data types for SAT-based veriﬁcation of

UML/OCL models. In 5th International Conference

on Tests and Proofs, pages 152–170, Zurich, Switzer-

land. Springer.

Soeken, M., Wille, R., and Drechsler, R. (2011b). Verifying dynamic aspects of UML models. In Design, Automation & Test in Europe, pages 1–6.

Torlak, E. and Jackson, D. (2007). Kodkod: a rela-

tional model ﬁnder. In 13th International Confer-

ence on Tools and Algorithms for the Construction

and Analysis of Systems, pages 632–647, Braga, Por-

tugal. Springer.

Wille, R., Soeken, M., and Drechsler, R. (2012). Debugging of inconsistent UML/OCL models. In 2012 Design, Automation & Test in Europe Conference & Exhibition, pages 1078–1083.

Wu, H. (2016). Generating metamodel instances satisfying

coverage criteria via SMT solving. In The 4th Interna-

tional Conference on Model-Driven Engineering and

Software Development, pages 40–51.

Wu, H. (2017a). Finding achievable features and constraint

conﬂicts for inconsistent metamodels. In 13th Euro-

pean Conference on Modelling Foundations and Ap-

plications, pages 179–196. Springer.

Wu, H. (2017b). MaxUSE: A tool for finding achievable constraints and conflicts for inconsistent UML class diagrams. In Integrated Formal Methods, pages 348–356. Springer.

Wu, H. (2019). Synthesising call sequences from OCL op-

erational contracts. In The 34th ACM/SIGAPP Sym-

posium on Applied Computing.

Wu, H., Monahan, R., and Power, J. F. (2013). Exploit-

ing attributed type graphs to generate metamodel in-

stances using an SMT solver. In 7th International

Symposium on Theoretical Aspects of Software Engi-

neering, Birmingham, UK.
