Using Model Scoping with Expected Model Elements to Support
Software Model Inspections: Results of a Controlled Experiment
Carlos Gracioli Neto (1,4), Amadeu Anderlin Neto (2,5), Marcos Kalinowski (2), Daniel Cardoso Moraes de Oliveira (1), Marta Sabou (3), Dietmar Winkler (3) and Stefan Biffl (3)
(1) Computing Institute, Federal Fluminense University, Niterói - RJ, Brazil
(2) Department of Informatics, Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Rio de Janeiro - RJ, Brazil
(3) Christian Doppler Laboratory for Security and Quality Improvement in the Production System Lifecycle (CDL-SQI), Information Systems Engineering, Vienna University of Technology, Vienna, Austria
(4) Federal Institute of Education, Science and Technology of Mato Grosso, Rondonópolis - MT, Brazil
(5) Federal Institute of Education, Science and Technology of Amazonas, Manaus - AM, Brazil
marta.sabou@ifs.tuwien.ac.at, {dietmar.winkler, stefan.biffl}@tuwien.ac.at
Keywords: Model Inspection, Model Quality Assurance, Model Scoping, Empirical Study.
Abstract: Context: Software inspection represents an effective way to identify defects in early phase software artifacts,
such as models. Unfortunately, large models and associated reference documents cannot be thoroughly in-
spected in one inspection session of typically up to two hours. Considerably longer sessions have shown a
much lower defect detection efficiency due to cognitive fatigue. Goal: The goal of this paper is to propose
and evaluate a Model Scoping approach to allow inspecting specific parts of interest in large models. Method:
First, we designed the approach, which involves identifying Expected Model Elements (EMEs) in selected
parts of the reference document and then using these EMEs to scope the model (i.e., remove unrelated parts).
These EMEs can also be used to support inspectors during defect detection. We conducted a controlled ex-
periment using industrial artifacts. Subjects were asked to conduct UML class diagram inspections based on
selected parts of functional specifications. In the experimental treatment, Model Scoping was applied and
inspectors were provided with the scoped model and the EMEs. The control group used the original model
directly, without EMEs. We measured the inspectors’ defect detection effectiveness and efficiency and col-
lected qualitative data on the perceived complexity. Results: Applying Model Scoping prior to the inspection
significantly increased the inspector defect detection effectiveness and efficiency, with large effect sizes.
Qualitative data allowed observing a perception of reduced complexity during the inspection. Conclusion:
Being able to effectively and efficiently inspect large models against selected parts of reference documents is
a practical need, in particular in the context of incremental and agile process models. The experiment showed
promising results for supporting such inspections using the proposed Model Scoping approach.
1 INTRODUCTION
Software engineering models represent abstractions
for different aspects of a software system (e.g., struc-
ture, behaviour, or interaction). The quality of such
models can be of key importance for completing pro-
jects successfully (Lange and Chaudron, 2005).
Hence, the verification of models prior to the creation
of software is of particular relevance for high-quality
information systems analysis.
Verifying the correct representation of domain
concepts in software models requires human
knowledge of the domain. Software inspection meth-
ods (Thelin et al., 2003; Travassos et al., 1999) have
been found effective to detect defects in requirements
and software models in empirical studies (El-
berzhager et al., 2012).
Software model inspection typically requires checking whether a conceptual model correctly and completely represents the content of suitable reference documents, such as systems specifications. In practice, models representing abstractions of large enterprise information systems tend to be large as well (e.g., UML class diagrams for large information systems may have hundreds of domain classes). Unfortunately, model inspection studies have only focused on the inspection of small-to-medium sized models so far.
Neto, C., Neto, A., Kalinowski, M., Moraes de Oliveira, D., Sabou, M., Winkler, D. and Biffl, S.
Using Model Scoping with Expected Model Elements to Support Software Model Inspections: Results of a Controlled Experiment.
DOI: 10.5220/0007691001070118
In Proceedings of the 21st International Conference on Enterprise Information Systems (ICEIS 2019), pages 107-118
ISBN: 978-989-758-372-8
Copyright © 2019 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
Hence, an important question is how to address
cases where large models need to be inspected against
their associated reference documents, i.e., involving
inspection materials beyond the size that an inspector
can cover within the limitations of a traditional one-
pass inspection process (Laitenberger and DeBaud,
2000), of typically up to two hours.
The strategy investigated in this paper to tackle
this problem involves scoping the model for selected
parts of the reference documents. It is noteworthy that
nowadays software is typically developed following
iterative or agile processes (Theocharis et al., 2015),
where new specifications (e.g., user stories or use
cases and their descriptions) are added incrementally.
Hence, being able to effectively and efficiently in-
spect large models against selected (or incremental)
parts of reference documents is a practical need.
We introduce the model scope concept as a well-
defined model part that acts as a filter or view show-
ing only relevant model elements. Our proposed
Model Scoping with Expected Model Elements ap-
proach consists of identifying Expected Model Ele-
ments (EMEs) in the selected parts of the reference
document and then using these EMEs to: (a) scope the
model (remove unrelated parts) and (b) guide the in-
spectors during defect detection. The idea of identify-
ing EMEs within reference documents has been used
to allow supporting inspection with crowdsourcing
(Winkler et al., 2017). In this paper, we investigate
using the EMEs for model scoping.
We conducted a controlled experiment with stu-
dents using real industrial artifacts (UML class dia-
grams and selected parts of functional specifications)
aiming to understand how Model Scoping would in-
fluence the model inspection effectiveness and effi-
ciency. Subjects were asked to conduct UML class di-
agram inspections based on selected parts of func-
tional specifications for two different modules. In the
experimental treatment Model Scoping with EMEs
was applied and inspectors were provided with the
EMEs and the scoped model. The control group used
the original model directly, without EMEs.
Applying Model Scoping with EMEs prior to inspection significantly increased the inspectors' defect detection effectiveness and efficiency. While this paper presents the first reference-document-based Model Scoping approach and the approach positively influenced model inspection effectiveness and efficiency, applying it properly requires some effort and the ability to correctly identify EMEs in selected parts of the reference document.
The remainder of this paper is organized as fol-
lows. Section 2 presents the background and related
work. Section 3 describes the Model Scoping ap-
proach. Sections 4 and 5 present the experimental
study and its results. Section 6 discusses the results.
Section 7 concerns the threats to validity. Finally,
Section 8 concludes and identifies future work.
2 SOFTWARE MODEL
INSPECTIONS
Software inspection (Fagan, 1976) is a well-estab-
lished formal defect detection approach that enables
efficient defect detection in early software develop-
ment phases, e.g., during software design.
Over the years, some research has been reported
regarding model inspections. Travassos et al. (1999),
for instance, introduced a reading technique for in-
specting object-oriented UML structure and behav-
iour models regarding the consistency between mod-
els and reference information. Sabaliauskaite et al.
(2002; 2003; 2004) conducted controlled experiments
with UML documents to compare and evaluate the ef-
fectiveness and efficiency of reading techniques with
different levels of guidance. Thelin et al. (2003) in-
troduced usage-based reading to guide inspectors by
first prioritizing business scenarios and then checking
whether a design model correctly represented the in-
formation of the most important business scenarios.
These studies have focused on a complete scope
of work, typically involving small-to-medium sized
models and their related reference documents, which
an average inspector can address in two hours to mit-
igate risks from fatigue. However, the studies did not
consider how to address larger inspection objects
(e.g., larger models and/or larger reference docu-
ments). Therefore, it is unclear to what extent their
findings hold for (parts of) larger software artifacts.
Winkler et al. (2017a; 2017b; 2018) investigated distributing the effort of model inspection across a group of inspectors by using crowdsourcing. While inspectors had to focus on specific parts of a model, model scoping was not directly considered in these investigations.
The idea of using model scoping to support software inspections was also explored by Briand et al. (2014). Their results show a significant decrease in effort and an increase in decision correctness when models are filtered prior to inspection. However, they focused on a particular problem, extracting design
slices (model fragments) to support safety inspection.
In contrast, the approach proposed in this paper, de-
tailed in the next section, is generic and not restricted
to a particular type of requirements.
3 MODEL SCOPING WITH EMES
The core idea of the approach Model Scoping with
EMEs is to define a model scope based on a selected
part of the reference documents (e.g., a selected part
of a large functional specification). The scoping is
conducted based on Expected Model Elements
(EMEs). The EMEs for the selected part of the refer-
ence documents should be identified and used to: (a)
scope the model (remove parts that are not related to
the EMEs) and (b) guide the inspectors during defect
detection.
To help identify the EMEs in the reference documents, guidelines commonly applied when designing the model could be used. For instance, Larman (2004) presents guidelines for identifying classes, attributes, operations and relationships for UML class diagrams based on requirements. Alternatively, Sabou et al. (2018) argue for the feasibility of using expert sourcing for this purpose. While such alternatives could be applied, for simplicity we assume in this paper that Model Scoping with EMEs is applied by a specific role, typically someone with skills similar to the ones required for identifying such elements when designing conceptual models (e.g., a requirements analyst).
Figure 1 outlines the context in which the Model
Scoping with EMEs approach is applied. Inputs are
the selected part of the reference documents and the
model to be reviewed, outputs are the list of EMEs
and the scoped model.
Figure 1: Model Scoping with EMEs approach.
Model Scoping with EMEs itself is conducted by
following 3 steps:
1. Based on the type of model to be inspected, define
the types of EMEs to be identified (e.g., for UML
class diagrams, EME types are classes, attributes,
operations and relationships; for UML state dia-
grams, EME types are states and transition
events).
2. Read the selected part of the reference documents
and identify the list of EMEs (existing guidelines
for identifying EMEs may be used to support this
step).
3. Scope the model by removing model elements that are not in the list of EMEs. While doing so, to avoid removing relevant contextual (closely related) information, the following elements should not be removed: (a) elements that represent relationships among elements included in the list of EMEs (e.g., an association between classes in a UML class diagram or a transition between two states in a UML state diagram); and (b) elements that are contained in an element included in the list of EMEs (e.g., attributes of a class in a UML class diagram).
During defect detection (not part of the Model
Scoping with EMEs approach), inspectors can use the
list of EMEs and the reference documents to verify
whether the EMEs are correctly represented in the
scoped model as foundation for reporting defects.
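The scoping step (step 3) can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions, not the procedure used by the authors: the data structures (`UmlClass`, `Association`, `ClassDiagram`), the function names, and the example class names are all hypothetical, and only class-level EMEs are considered (the approach also covers attributes, operations and relationships as EME types).

```python
from dataclasses import dataclass, field


@dataclass
class UmlClass:
    name: str
    attributes: list[str] = field(default_factory=list)
    operations: list[str] = field(default_factory=list)


@dataclass
class Association:
    source: str
    target: str


@dataclass
class ClassDiagram:
    classes: dict[str, UmlClass]
    associations: list[Association]


def scope_model(model: ClassDiagram, eme_classes: set[str]) -> ClassDiagram:
    """Keep only classes listed as EMEs, plus closely related context."""
    # Rule (b): contained elements (attributes, operations) stay with their class.
    kept = {name: cls for name, cls in model.classes.items() if name in eme_classes}
    # Rule (a): keep relationships whose two ends are both EME classes.
    links = [a for a in model.associations if a.source in kept and a.target in kept]
    return ClassDiagram(classes=kept, associations=links)


def emes_missing_from_model(model: ClassDiagram, eme_classes: set[str]) -> set[str]:
    """EMEs with no counterpart in the model: candidates for omission defects."""
    return eme_classes - set(model.classes)


# Hypothetical billing-like example (class names are invented for illustration).
full_model = ClassDiagram(
    classes={
        "Invoice": UmlClass("Invoice", attributes=["number", "amount"]),
        "Payment": UmlClass("Payment", attributes=["date", "value"]),
        "Customer": UmlClass("Customer", attributes=["name"]),
        "Receipt": UmlClass("Receipt", attributes=["code"]),
    },
    associations=[
        Association("Invoice", "Payment"),
        Association("Invoice", "Customer"),
        Association("Invoice", "Receipt"),
    ],
)
emes = {"Invoice", "Payment", "Customer", "ServiceOrder"}

scoped = scope_model(full_model, emes)
print(sorted(scoped.classes))                      # classes left in the scoped model
print(emes_missing_from_model(full_model, emes))   # EMEs an inspector should look for
```

In this sketch the class `Receipt` and its association are cut out because no EME refers to them, while the EME `ServiceOrder`, which has no counterpart in the model, is surfaced as something the inspector may want to check during defect detection.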
4 EXPERIMENT DESCRIPTION
Aiming to understand how Model Scoping with EMEs
would influence the model inspection effectiveness
and efficiency, we designed and conducted an exper-
imental study. The goal, planning (i.e., context, vari-
ables, hypotheses, subject selection, design, and in-
strumentation), and operation of the experiment are
detailed in the following subsections.
4.1 Experiment Goal
The experiment goal was defined based on the GQM
(Goal-Question-Metric) template (van Solingen et al.,
2002). Table 1 presents the experiment goal.
It is noteworthy that, while the approach is in-
tended to be generic (i.e., applicable to any type of
model), we instantiated the experiment using UML
class diagrams to be inspected with respect to func-
tional information system specifications.
Based on our goal, we derived the following re-
search question: What is the impact of Model Scoping
with EMEs on inspection effectiveness and effi-
ciency? The variables used to answer this research
question are described in detail in Subsection 4.3.
Table 1: Experiment goal.
Analyze                      the inspection of UML class diagrams using Model Scoping with EMEs
for the purpose of           characterization
with respect to              inspection effectiveness & efficiency
from the point of view of    the information systems researcher
in the context of            UML class diagram inspection based on a valid functional specification, conducted by novice inspectors, when compared to not using Model Scoping with EMEs.
4.2 Experiment Context
The experiment was conducted in two undergraduate classroom trials, representing exact internal replications, involving students enrolled in Software Engineering classes at the Fluminense Federal University. These students were asked to review UML class diagrams based on correct functional specifications.
In order to make the context more representative,
we selected artifacts from a real industrial software
project. The project concerned the development of an
integrated management system, with several mod-
ules. We selected the functional specification and
class diagrams of two modules, one module concern-
ing simpler administrative functions and the other
module more complex financial billing services. Each
functional specification contained an overview de-
scription, a list of functional requirements, use case
diagrams, and use case descriptions. It is noteworthy
that the specifications had been reviewed by profes-
sionals and validated by industrial stakeholders.
For the experiment, we selected excerpts of each
functional specification related to specific use cases,
which were good representatives of use cases to be
implemented in a next development cycle and against
which the model should be verified before implemen-
tation. The excerpt of the administrative module com-
prised the contextual information (i.e., overview,
functional requirements, use case diagram, and use
case descriptions) for four small use cases, while the
excerpt of the financial billing module comprised the
contextual information for two more complex use
cases. These specifications were assumed to be cor-
rect. We seeded each class diagram with 28 artificial
defects related to the respective functional specifica-
tion excerpts. For distributing the types of defects, we
considered the defect taxonomy proposed by Shull
(1998), containing the types ambiguity, incon-
sistency, incorrect fact, omission, and extraneous in-
formation. We seeded 7 defects of each type (except
inconsistency, given that each task involved inspect-
ing a single model that could therefore not be incon-
sistent with others). The seeding also considered in-
cluding defects of different difficulty levels (easy,
medium, and hard) for each type.
The first author applied the Model Scoping with
EMEs activity, building the lists of EMEs and the
scoped models for both class diagrams using the func-
tional specifications excerpts. The third author re-
viewed this activity.
4.3 Variables Selection
The independent variable in the inspection experi-
ment is the treatment applied by the groups in order
to find defects in the UML model. While both groups
used an ad-hoc inspection technique (i.e., no specific
reading technique), the experimental group received
the Defect Taxonomy, the list of EMEs, and the
scoped model, and the control group received the De-
fect Taxonomy and the full model.
As dependent variables, we used the effectiveness and efficiency of the inspection, defined as follows:
Effectiveness is the ratio between the number of real defects found and the total number of known defects.
Efficiency is the ratio between the number of real defects found and the time spent.
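As a small worked example of these two definitions, the following Python sketch computes both measures for subject P1 in Exercise A (8 real defects found in 71 minutes, with 28 seeded defects, as reported in Table 4). The function names are our own, and expressing efficiency in defects per hour is taken from the caption of Figure 5.

```python
def effectiveness(real_defects_found: int, known_defects: int) -> float:
    """Share of the known (seeded) defects that the inspector found."""
    return real_defects_found / known_defects


def efficiency(real_defects_found: int, minutes_spent: float) -> float:
    """Real defects found per hour of inspection time."""
    return real_defects_found / (minutes_spent / 60.0)


# Subject P1, Exercise A (values from Table 4).
p1_effectiveness = effectiveness(8, 28)
p1_efficiency = efficiency(8, 71)
print(f"{p1_effectiveness:.0%}")   # 29%
print(f"{p1_efficiency:.2f}")      # 6.76
```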
Besides measuring these variables, we collected
qualitative feedback in a follow-up questionnaire, in-
spired by the Technology Acceptance Model (TAM)
questionnaire (Davis, 1989) regarding the perceived
usefulness, ease of use, and behavioral intention of
adoption. TAM provides proper theoretical constructs
and has been extensively used and validated for this
purpose (Turner et al., 2010).
4.4 Hypotheses
Using the variables described in the previous subsec-
tion, we defined the following null hypotheses:
H01 – there is no difference in the effectiveness when inspecting UML class diagrams with or without using Model Scoping with EMEs.
H02 – there is no difference in the efficiency when inspecting UML class diagrams with or without using Model Scoping with EMEs.
4.5 Selection of Subjects
Subjects were intended to represent novice inspec-
tors, to avoid specific knowledge on inspection tech-
niques as a significant confounding factor. We se-
lected students from two undergraduate classes on
Software Engineering at the Fluminense Federal Uni-
versity, involving 44 daily shift and 10 nightly shift
students.
All students filled in a characterization form with objective questions to inform us about their expertise in the topics related to the study: (a) their experience with software development; (b) their experience in object-oriented modeling (UML); and (c) their experience in software inspection. Based on each student's form, we rated the experience for each expertise topic as none (N), low (L), medium (M), medium-high (MH), or high (H).
For instance, regarding experience with software
development, a subject was characterized as having:
(N) no experience, if s/he never had contact with soft-
ware development; (L) low experience, if s/he had
contact with software development only in class; (M)
medium experience if s/he had contact with program-
ming in an academic project; (MH) medium-high ex-
perience, if s/he had contact with programming in an
industry project; or (H) high experience, if s/he had
industrial experience involving multiple projects.
Likewise, the experience in software inspection and
in object-oriented modeling (UML) were assigned.
After characterizing the participants' experience, aiming to mitigate threats to validity concerning the distribution of subjects between the groups, we applied the principles of balancing, blocking, and random assignment (Wohlin et al., 2012). For balancing, we attempted to create groups of equal size. However, due to some absences on the day of the experiment execution, one group had a larger number of members. Concerning blocking, we avoided having one group with more experienced subjects than the other. Finally, subjects of equal experience were randomly assigned to the groups.
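The combination of balancing, blocking, and random assignment can be illustrated with the following Python sketch. It is a simplified illustration and not the exact procedure followed in the experiment: it assumes a single experience level per subject (the study characterized three expertise topics), and the subject identifiers, the `assign_groups` function, and the seed are hypothetical.

```python
import random
from collections import defaultdict


def assign_groups(experience: dict[str, str], seed: int = 0) -> dict[str, int]:
    """Assign subjects to groups 1 and 2, randomly within experience blocks."""
    rng = random.Random(seed)

    # Blocking: collect subjects that share the same experience level.
    blocks: dict[str, list[str]] = defaultdict(list)
    for subject, level in sorted(experience.items()):
        blocks[level].append(subject)

    assignment: dict[str, int] = {}
    next_group = 1
    for level in sorted(blocks):
        members = blocks[level]
        rng.shuffle(members)  # random assignment inside each block
        for subject in members:
            assignment[subject] = next_group
            # Balancing: alternate groups so sizes differ by at most one.
            next_group = 2 if next_group == 1 else 1
    return assignment


# Hypothetical subjects with a single experience rating each.
experience = {"S1": "H", "S2": "H", "S3": "M", "S4": "M",
              "S5": "M", "S6": "L", "S7": "L"}
groups = assign_groups(experience, seed=42)
print(groups)
```

Alternating the target group across blocks keeps each experience level spread over both groups while also keeping the overall group sizes as equal as possible.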
The experiment comprised two exercises applied on two days (one exercise per day). Students who participated in only one exercise were removed from the analysis. In addition, participants who found fewer than 10% of the defects were discarded as outliers, because we interpreted their results as those of students who had difficulty understanding the task or who did not engage in the activity for some other reason.
Thus, out of 44 participants in the first trial, the data
of 32 participants was considered for the data analy-
sis. Regarding the second trial, out of 10 participants,
the data of 8 participants was considered.
Table 2 (first trial subjects) and Table 3 (second
trial subjects) present the results of the subjects’ char-
acterization and the group division. To facilitate un-
derstanding of the blocking, we highlighted the most
experienced subjects in the tables with grey-tone
background filling.
4.6 Experiment Design
The experiment design is a one-factor design with
two treatments (ad-hoc with or without Model Scop-
ing with EMEs) and two different tasks (artifact in-
spection exercises, study objects). We adopted a
cross-over design to mitigate threats to validity of the
experiment concerning: (i) differences among exper-
imental tasks (the influence of the provided exercise
materials on the results); and (ii) the learning effect
(the influence of the order in which the treatments are
applied on the outcomes). It is noteworthy that the
principles applied to distribute the subjects between
the groups still enable comparing the results for each
individual exercise. The cross-over design is shown
in Figure 2.
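The cross-over assignment can also be written down as a small data structure and checked mechanically. The sketch below encodes the assignment of treatments to groups and exercises as described in Section 4.8; the dictionary layout and the `is_cross_over` helper are our own illustrative choices.

```python
TREATMENTS = ("ad-hoc", "Model Scoping with EMEs")

# Treatment applied by each group in each exercise (see Section 4.8).
schedule = {
    "Group 1": {"Exercise A": "ad-hoc", "Exercise B": "Model Scoping with EMEs"},
    "Group 2": {"Exercise A": "Model Scoping with EMEs", "Exercise B": "ad-hoc"},
}


def is_cross_over(plan_by_group: dict[str, dict[str, str]]) -> bool:
    """True if every group applies both treatments and every exercise
    is inspected under both treatments."""
    every_group_gets_both = all(
        set(plan.values()) == set(TREATMENTS) for plan in plan_by_group.values()
    )
    exercises = {ex for plan in plan_by_group.values() for ex in plan}
    every_exercise_sees_both = all(
        {plan[ex] for plan in plan_by_group.values()} == set(TREATMENTS)
        for ex in exercises
    )
    return every_group_gets_both and every_exercise_sees_both


print(is_cross_over(schedule))
```

The second condition is what allows the treatments to be compared within each individual exercise, while the first one addresses the influence of the order in which the treatments are applied.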
Table 2: Expertise per participant in the first trial.
Group  ID   Software Development  UML Models  Software Inspection
1      P1   MH                    M           M
1      P2   L                     MH          L
1      P3   M                     MH          M
1      P4   MH                    H           L
1      P5   M                     M           L
1      P6   L                     M           L
1      P7   MH                    L           L
1      P8   H                     H           M
1      P9   MH                    M           L
1      P10  L                     M           L
1      P11  H                     H           L
1      P12  L                     M           L
1      P13  L                     MH          M
1      P14  L                     L           L
1      P15  M                     M           M
1      P16  MH                    M           L
1      P17  MH                    MH          M
1      P18  M                     L           L
2      P19  H                     MH          L
2      P20  L                     MH          L
2      P21  L                     L           L
2      P22  MH                    MH          L
2      P23  MH                    MH          L
2      P24  MH                    MH          M
2      P25  L                     L           M
2      P26  L                     L           L
2      P27  L                     L           L
2      P28  L                     L           L
2      P29  MH                    MH          L
2      P30  M                     M           MH
2      P31  MH                    M           L
2      P32  MH                    L           L
Table 3: Expertise per participant in the second trial.
Group  ID   Software Development  UML Models  Software Inspection
1      P33  H                     H           L
1      P34  L                     H           L
1      P35  M                     M           L
1      P36  H                     H           M
2      P37  H                     H           M
2      P38  H                     H           L
2      P39  L                     H           L
2      P40  M                     H           L
Figure 2: Cross-over experiment design.
4.7 Instrumentation
The instruments used in this experiment were: con-
sent and characterization forms; training material on
model inspection; task description, with instructions
to perform the inspection, including the defect taxon-
omy; a defect reporting form; an excerpt of a real in-
dustrial functional specification (the overview of the
system module to be inspected, the functional re-
quirements, the use case diagrams and their descrip-
tions); and an industrial UML class diagram with de-
fects seeded by the authors. When Model Scoping
with EMEs was applied, instead of the UML class di-
agram, subjects received the scoped UML class dia-
gram and the list of EMEs.
The scoped models were prepared by applying
Model Scoping with EMEs on the defect seeded UML
class diagrams based on the functional specification
excerpts. For the administrative module, researchers
identified a list of EMEs in the contextual information
(i.e., overview, functional requirements, use case di-
agram, and use case descriptions) for four small use
cases (maintaining company data, maintaining cus-
tomer data, maintaining tax information, and main-
taining cost centers). Then, using this list of EMEs,
researchers scoped the UML class diagram by apply-
ing Model Scoping with EMEs as described in Section
3. This process reduced the domain class diagram for
this module from 19 to 12 classes.
Similarly, for the billing module, based on the two
(more complex) selected use cases (registering in-
voices for provided services and registering payments
for invoices), the class diagram was reduced from 22
to 16 classes. Figure 3 shows the cut-outs performed
during the scoping for the billing module. These cut-
outs are related to entities concerning the physical
emission of invoices (in communication with the fed-
eral invoice emission system) and of other receipts,
which are detailed in two other use cases that were
not part of the selection. Thus, these model parts
would only add complexity to the inspection, as they
were not related to the selection against which the
model should be verified.
While the scoping allowed cutting out some parts that were irrelevant for the inspection scope, many classes still remained, given that the selected use cases were very central for each module. The researchers did not bias this process, as the artifacts were used as provided by the industrial partner. The excerpt of the functional specification for the inspection task was planned for an inspection of up to 75 minutes (industrial inspections are recommended to take no longer than 2 hours). Note that even if one of these modules had hundreds of use cases and hundreds of domain classes, the cut-out for the selected use cases would still contain the same set of relevant classes, namely those related to the EMEs contained in the excerpt.
Figure 3: Cut-outs during Model Scoping (dashed rectan-
gles) for the billing module.
4.8 Experiment Operation
The experiment was conducted on three days. On the
first day, the participants answered the characteriza-
tion form in order to allow dividing them into exper-
iment groups. Prior to the execution of the experi-
ment, on the second day, basic concepts of the dia-
grams and relevant types of defects were reviewed
with participants in a training session of 15 minutes.
After that, the inspection was conducted as follows.
In the first round (Exercise A), Group 1 inspected the full UML class diagram with 19 classes, based on the requirements document using the ad-hoc treatment. In contrast, the participants of Group 2 (Model Scoping with EMEs treatment) inspected the
scoped UML class diagram with 12 classes. Besides
the exact same functional specification excerpt, the
subjects of Group 2 used the list of EMEs. All subjects
had to report the defects found in the diagram. In to-
tal, the inspectors had 75 minutes to perform the in-
spection, including reporting the defects found. It is
important to mention that communication between
participants was not allowed. After the inspection, the
participants answered the follow-up questionnaire.
On the last day, the procedures conducted for Ex-
ercise A (administrative module) were repeated for
Exercise B (billing module). However, the group that
was previously assigned to the ad-hoc treatment was
now assigned to the Model Scoping treatment and
vice-versa. We used the exact same procedures and
instrumentation for both trials. In the next section, we
present the results of the experiment.
5 RESULTS
This section reports on the analysis of quantitative
and qualitative data collected in the experiment.
5.1 Quantitative Analysis
The data analyst obtained the quantitative data from
the defect reporting form, resulting from the experi-
mental task. The data analyst counted the number of
real defects found, false positives, and time used by
each subject. Following the design presented in Fig-
ure 2, Table 4 presents both the results per subject and
the overall result per treatment of the Exercise A and
Exercise B, respectively.
The data analyst performed the statistical analyses using the statistical tool SPSS v 25.0.0. For hypothesis testing, the data analyst used the non-parametric Mann-Whitney test with α = 0.10. This statistical significance level has been found acceptable for software engineering experiments, which typically involve small sample sizes (Dybå et al., 2006). In addition, to increase our sample size, we decided to aggregate the results of both trials, which, given that the trials were exact internal replications, should not expose us to additional threats to validity.
Figure 4 shows descriptive results as boxplots that
compare the effectiveness indicator of both exercises
(A and B). It is possible to observe that the median
for the Model Scoping with EMEs treatment in both
exercises (A median 0.41, A mean 0.41, B median
0.50, B mean 0.48) is higher than the median for the
ad-hoc treatment (A median 0.32, A mean 0.36, B
median 0.28, B mean 0.29). The effect sizes (stand-
ardized mean difference between two populations)
for exercises A and B are respectively 0.47 (medium)
and 1.57 (very large). Thus, the group that used
Model Scoping with EMEs was more effective than
the control group. Also, using the Mann-Whitney test
showed a significant difference between the groups (p
= 0.075 for Exercise A and p = 0.001 for Exercise B).
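The descriptive statistics for Exercise A can be recomputed from the defect counts in Table 4 with the Python standard library, as sketched below. Two points are assumptions on our side: the paper does not state which variant of the standardized mean difference was used, so the sketch uses Cohen's d with a pooled standard deviation and its result may differ slightly from the reported 0.47; and the sketch only computes the Mann-Whitney U statistics, not the p-values, which the authors obtained with SPSS (a statistics package such as SciPy's `scipy.stats.mannwhitneyu` could be used for that purpose).

```python
from math import sqrt
from statistics import mean, median, stdev

SEEDED_DEFECTS = 28  # defects seeded per class diagram (Section 4.2)

# Real defects found in Exercise A, copied from Table 4.
ad_hoc = [8, 10, 15, 13, 7, 9, 10, 7, 9, 10, 17,
          9, 6, 4, 12, 7, 15, 11, 18, 9, 8, 7]        # Group 1
scoping = [11, 12, 14, 9, 16, 9, 11, 10, 15,
           9, 9, 6, 12, 14, 16, 13, 12, 11]           # Group 2


def effectiveness(found: list[int]) -> list[float]:
    """Defects found divided by seeded defects, per subject."""
    return [f / SEEDED_DEFECTS for f in found]


def cohens_d(treatment: list[float], control: list[float]) -> float:
    """Standardized mean difference using the pooled standard deviation
    (assumed variant; the paper does not name the exact formula)."""
    n1, n2 = len(treatment), len(control)
    pooled_var = ((n1 - 1) * stdev(treatment) ** 2
                  + (n2 - 1) * stdev(control) ** 2) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


def mann_whitney_u(x: list[float], y: list[float]) -> tuple[float, float]:
    """U statistics by pair counting; ties contribute one half."""
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    return u_x, len(x) * len(y) - u_x


eff_ad_hoc = effectiveness(ad_hoc)
eff_scoping = effectiveness(scoping)

print(f"ad-hoc:  median {median(eff_ad_hoc):.2f}, mean {mean(eff_ad_hoc):.2f}")
print(f"scoping: median {median(eff_scoping):.2f}, mean {mean(eff_scoping):.2f}")
print(f"Cohen's d (pooled SD): {cohens_d(eff_scoping, eff_ad_hoc):.2f}")
print("U statistics:", mann_whitney_u(eff_scoping, eff_ad_hoc))
```

The medians and means printed by this sketch match the values reported for Exercise A (0.32 and 0.36 for the ad-hoc treatment, 0.41 and 0.41 for Model Scoping with EMEs).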
Figure 4: Effectiveness (defects found / seeded defects) in-
dicator for defect detection.
These results suggest that the use of Model Scoping with EMEs allowed the inspectors to achieve higher defect detection effectiveness. These results allow rejecting the null hypothesis H01.
Figure 5 shows boxplots comparing the efficiency
of the treatments. For efficiency it is also possible to
observe that the median of the Model Scoping treat-
ment in both exercises (A median 10.64, A mean
10.60, B median 11.05, B mean 11.53) is higher than
the median of the ad-hoc treatment (A median 7.71,
A mean 8.58, B median 7.67, B mean 7.43). The ef-
fect sizes for exercises A and B are respectively 0.65
(medium-to-large) and 1.30 (very large).
Figure 5: Efficiency (defects found per hour) indicator for
defect detection.
Table 4: Quantitative results per subject and treatment.
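Columns: ID, followed by five values for Exercise A and then the same five values for Exercise B.
ID | Exercise A: Duration (min), Defects found, False positives, Effect., Effic. (defects/hour) | Exercise B: Duration (min), Defects found, False positives, Effect., Effic. (defects/hour)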
Group 1
P1 71 8 4 29% 6.76 73 13 14 46% 10.68
P2 72 10 4 36% 8.33 75 13 4 46% 10.40
P3 72 15 8 54% 12.50 75 17 7 61% 13.60
P4 69 13 6 46% 11.30 73 14 15 50% 11.51
P5 68 7 11 25% 6.18 66 12 0 43% 10.91
P6 71 9 4 32% 7.61 78 12 14 43% 9.23
P7 68 10 6 36% 8.82 75 14 5 50% 11.20
P8 66 7 4 25% 6.36 72 11 6 39% 9.17
P9 70 9 5 32% 7.71 75 15 10 54% 12.00
P10 69 10 4 36% 8.70 75 12 0 43% 9.60
P11 72 17 9 61% 14.17 76 17 5 61% 13.42
P12 70 9 12 32% 7.71 70 14 7 50% 12.00
P13 72 6 13 21% 5.00 63 15 5 54% 14.29
P14 69 4 8 14% 3.48 72 13 11 46% 10.83
P15 70 12 9 43% 10.29 75 18 23 64% 14.40
P16 72 7 7 25% 5.83 75 12 3 43% 9.60
P17 69 15 13 54% 13.04 73 13 20 46% 10.68
P18 69 11 9 39% 9.57 70 15 12 54% 12.86
P33 68 18 4 64% 15.88 75 16 18 57% 12.80
P34 75 9 7 32% 7.20 75 13 4 46% 10.40
P35 71 8 4 29% 6.76 75 12 3 43% 9.60
P36 75 7 13 25% 5.60 75 18 15 64% 14.40
Group 2
P19 60 11 4 39% 11.00 50 7 11 25% 8.40
P20 67 12 4 43% 10.75 63 4 0 14% 3.81
P21 51 14 5 50% 16.47 50 8 12 29% 9.60
P22 71 9 3 32% 7.61 66 10 2 36% 9.09
P23 72 16 10 57% 13.33 73 12 8 43% 9.86
P24 72 9 9 32% 7.50 75 8 4 29% 6.40
P25 71 11 6 39% 9.30 66 4 14 14% 3.64
P26 57 10 4 36% 10.53 74 4 7 14% 3.24
P27 63 15 10 54% 14.29 75 12 3 43% 9.60
P28 60 9 3 32% 9.00 60 10 5 36% 10.00
P29 67 9 15 32% 8.06 52 13 6 46% 15.00
P30 72 6 14 21% 5.00 75 7 4 25% 5.60
P31 61 12 4 43% 11.80 65 6 5 21% 5.54
P32 62 14 7 50% 13.55 55 9 6 32% 9.82
P37 75 16 2 57% 12.80 75 8 2 29% 6.40
P38 68 13 4 46% 11.47 75 3 17 11% 2.40
P39 75 12 3 43% 9.60 59 8 18 29% 8.14
P40 75 11 0 39% 8.80 75 9 3 32% 7.20
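The derived columns of Table 4 can be recomputed from the raw counts. The sketch below assumes that efficiency is the number of defects found per hour (as in the caption of Figure 5) and that effectiveness is the share of seeded defects found. The total of 28 seeded defects is an assumption inferred from the table's percentages (14 defects found correspond to 50%); it is not stated in this section. The function names are illustrative.

```python
# Assumption: inferred from Table 4, where 14 defects found = 50%.
SEEDED_DEFECTS = 28


def effectiveness(defects_found, seeded=SEEDED_DEFECTS):
    """Share of seeded defects found, in percent."""
    return 100.0 * defects_found / seeded


def efficiency(defects_found, duration_min):
    """Defects found per hour of inspection."""
    return defects_found / (duration_min / 60.0)


# Participant P1, Exercise A: 8 defects found in 71 minutes.
print(round(effectiveness(8)), round(efficiency(8, 71), 2))  # → 29 6.76
```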
ICEIS 2019 - 21st International Conference on Enterprise Information Systems
Thus, the group that used Model Scoping with EMEs
was more efficient than the control group. Again, the
Mann-Whitney test shows the difference between the
groups to be statistically significant (p = 0.024 for
Exercise A and p = 0.001 for Exercise B). Hence, using
Model Scoping with EMEs allowed inspectors to achieve
higher efficiency.
These results allow rejecting the null hypothesis H02.
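The group comparison can be retraced on the Exercise B efficiency values of Table 4. The sketch below is a plain-Python Mann-Whitney U test with a normal approximation and tie correction. It makes two assumptions: Group 1 used Model Scoping with EMEs in Exercise B (inferred from the reported medians), and no values are excluded. The test variant used in the experiment is not stated and outliers were removed there, so the resulting p-value may differ from the reported one.

```python
from math import erfc, sqrt

# Efficiency (defects per hour) in Exercise B, copied from Table 4.
scoped_b = [10.68, 10.40, 13.60, 11.51, 10.91, 9.23, 11.20, 9.17,
            12.00, 9.60, 13.42, 12.00, 14.29, 10.83, 14.40, 9.60,
            10.68, 12.86, 12.80, 10.40, 9.60, 14.40]   # Group 1
adhoc_b = [8.40, 3.81, 9.60, 9.09, 9.86, 6.40, 3.64, 3.24, 9.60,
           10.00, 15.00, 5.60, 5.54, 9.82, 6.40, 2.40, 8.14, 7.20]  # Group 2


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U statistic of x, approximate p-value). Ties receive
    average ranks and the variance is tie-corrected; no continuity
    correction is applied.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2, n = len(x), len(y), len(x) + len(y)
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0      # mean of ranks i+1 .. j
        ties = j - i
        tie_term += ties ** 3 - ties
        rank_sum_x += avg_rank * sum(1 for k in range(i, j)
                                     if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:                        # all observations identical
        return u_x, 1.0
    z = (u_x - n1 * n2 / 2.0) / sqrt(var_u)
    return u_x, erfc(abs(z) / sqrt(2.0))


print(mann_whitney_u(scoped_b, adhoc_b))
```

A statistics library would normally be preferable to a hand-written test; the explicit version is shown only to make the ranking and tie handling visible.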
5.2 Qualitative Analysis
The data analyst collected qualitative data from the
follow-up questionnaires. The participants rated the
perceived complexity of the task (as very complex,
complex, simple, or very simple) and noted difficulties
they had during the experiment. Participants using the
Model Scoping with EMEs treatment (who received a list
of EMEs to support the inspection) also filled in the
TAM questionnaire, using a five-point Likert scale
(completely disagree, disagree, neutral, agree, and
completely agree) for the questions on perceived
usefulness, ease of use, and intention of adoption.
Regarding the perceived complexity of the task,
shown in Table 5, there was a subtle difference
between the treatments. While overall the tasks were
considered complex by the participants, which was
expected as they were handling real industrial
artifacts, the number of participants who perceived
the task as simple was larger for the Model Scoping
with EMEs treatment. In particular, Exercise B (the
billing system, which had more complex business rules)
was perceived as more complex by the participants when
using the ad-hoc treatment. Thus, Model Scoping and
the EMEs seem to have reduced the perceived
complexity. For instance, participant P34 mentioned
"difficulties to understand the description of use
cases and then compare against the class diagram" and
suggested that "the diagram could be […] smaller".
Participant P1 mentioned that "browsing through the
numerous descriptions when searching defects is a
complex task […], some kind of guidance is needed".
The TAM questionnaire was applied only for the
Model Scoping with EMEs treatment. In general, the
inspectors perceived receiving the EMEs to support
inspection as useful and easy to use, and they would
like to receive such support. Note that the
participants did not know that they were inspecting a
scoped model. Indeed, all participants agreed or
completely agreed with the TAM questions. Moreover, we
noticed that many inspectors who used the Model
Scoping with EMEs treatment in the first exercise
mentioned that they missed this kind of support in the
second exercise. For instance, participant P32
mentioned that the exercise contained "a lot of
information" and that it was "difficult to understand
what to do without a guide". He also requested "the
extra page (EMEs) of the previous exercise".
Participant P28 mentioned difficulties in "identifying
defects in relationships between classes" and reported
that "the use of EMEs" could make the task more
enjoyable.
Table 5: Complexity of the experimental tasks as perceived
by participants.
Treatment      Module   Very Simple  Simple  Complex  Very Complex
Ad-hoc         Adm.     0%           5%      90%      5%
Ad-hoc         Billing  0%           12.5%   50%      37.5%
Model Scoping  Adm.     0%           15%     75%      10%
Model Scoping  Billing  0%           22.5%   72.5%    5%
6 DISCUSSION
In this section we discuss the results of the quantita-
tive and qualitative analyses of experiment data.
The main goal of the quantitative analysis was to
investigate how the Model Scoping with EMEs approach
would affect the effectiveness and efficiency of
inspectors when reviewing models based on reference
documents. We selected real industrial class diagrams
and their related functional specifications as
reference documents for two modules of an integrated
management system. As input for the experiment, we
selected a set of use cases (and their contextual
information) for each module and conducted the Model
Scoping with EMEs activity as described in Section 3.
The experiment results indicate that applying the
proposed Model Scoping with EMEs approach before
the inspection improved both effectiveness and effi-
ciency of the inspectors when reviewing the UML
class diagrams against the functional specification ex-
cerpts. Moreover, the results were statistically signif-
icant and had large effect sizes.
As a complement, the qualitative analysis indicated
that inspectors perceive their inspection tasks as
less complex if Model Scoping with EMEs is applied
before the inspection (i.e., they inspect a scoped
model and receive a list of EMEs). They also perceived
it as useful to receive candidate EMEs to support the
verification of the model against the reference
documents.
These results suggest applying Model Scoping with
EMEs before inspections in situations where large UML
class diagrams are to be inspected against excerpts
(or increments) of functional specifications. This is
in line with the results by Briand et al. (2014), who
also observed effort reductions when scoping models
for their specific purposes (addressing safety
requirements). While we assume Model Scoping with EMEs
to be applicable to any kind of model, we limit the
advice and recommendations to the findings of our
specific experimental setting.
It is also noteworthy that properly applying Model
Scoping with EMEs requires some effort and the ability
to correctly identify EMEs in selected parts of the
reference document. The first author took one hour to
identify the EMEs within each functional specification
excerpt and then used the EMEs to scope the model.
Still, we believe that the effort is worthwhile in
many cases (in particular for large models),
considering that inspection teams usually involve
three to five inspectors, who would perceive their
task as less complex and would be more effective and
efficient. Approaches for identifying expected model
elements in natural text have been investigated (Sabou
et al., 2018) and could be used in this context.
7 THREATS TO VALIDITY
In this section, we present and discuss the threats to
the validity of the controlled experiment, organized
by the categories described by Wohlin et al. (2012).
7.1 Internal Validity
The tasks of the experiment were performed by the
participants individually and under the supervision of
one of the researchers. Communication among the
participants was not allowed.
After characterizing the participants' experience,
the principles of balancing, blocking and random as-
signment (Wohlin et al., 2012) were applied to miti-
gate threats to validity regarding the distribution of
subjects between groups.
7.2 External Validity
Regarding artifact representativeness, we used real
industrial UML class diagrams and functional speci-
fications. Still, they represent artifacts from one spe-
cific organization. For subject representativeness, we
used students to represent novice inspectors. Using
students for this purpose is a valid abstraction, in par-
ticular considering that they have been properly char-
acterized (Falessi et al., 2018).
Still, the artifacts are from one specific organization
and the subjects come from one specific context.
Therefore, we make no claims of external validity, and
the study's validity is specific to the context in
which it was performed. Indeed, we call for
replications involving a variety of artifacts (e.g.,
models and reference documents) and contexts to
reinforce the experimental evidence.
7.3 Construct Validity
Regarding the treatment, the experiment task in-
volved inspections using real industrial artifacts. We
employed a cross-over design to isolate confounding
factors related to the experimental tasks (i.e., the ex-
ercises) and the learning effect. In addition, it is im-
portant to note that the metrics used to measure effec-
tiveness and efficiency have been widely used in em-
pirical studies on software inspection.
A potential threat to construct validity concerns
the defects seeded in the diagrams. To mitigate this
threat, the defects were distributed in a balanced way
with respect to their types and their levels of
detection difficulty.
7.4 Conclusion Validity
To improve conclusion validity, we aggregated the
subjects from both trials to increase the sample size
for data analysis. This was possible given that we had
two exact internal replications following the same
cross-over design. Outliers were carefully removed to
avoid influence on the results. We applied appropriate
statistical tests and our results were statistically sig-
nificant with large effect sizes. Based on these results,
we are confident that the drawn conclusions are valid
for the reported experimental setting.
8 CONCLUSION AND FUTURE WORK
In this paper, we introduced the model scope concept
as a well-defined model part that acts as a filter or
view showing only relevant model elements. We pro-
posed and evaluated an approach for Model Scoping
with Expected Model Elements. The approach con-
sists of identifying Expected Model Elements (EMEs)
in the selected parts of the reference document and
then using these EMEs to scope the model and guide
inspectors during defect detection.
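The scoping idea summarized above can be illustrated with a small sketch. It assumes a deliberately simplified class diagram (class names plus binary associations) and treats EMEs as plain class names; the actual procedure is described in Section 3 and may differ. All names below (`ClassModel`, `scope_model`, `missing_emes`, and the example classes) are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ClassModel:
    """Simplified class diagram: class names and binary associations."""
    classes: set[str] = field(default_factory=set)
    associations: set[tuple[str, str]] = field(default_factory=set)


def scope_model(model: ClassModel, emes: set[str]) -> ClassModel:
    """Keep only classes named in the EMEs and the associations
    whose two ends are both kept (i.e., drop unrelated parts)."""
    kept = model.classes & emes
    links = {(a, b) for (a, b) in model.associations
             if a in kept and b in kept}
    return ClassModel(kept, links)


def missing_emes(model: ClassModel, emes: set[str]) -> set[str]:
    """EMEs without a matching class: hints for the inspector."""
    return emes - model.classes


# Hypothetical example, not taken from the experiment artifacts.
model = ClassModel(
    classes={"Invoice", "Customer", "Payment", "AuditLog"},
    associations={("Invoice", "Customer"), ("Invoice", "Payment"),
                  ("AuditLog", "Customer")},
)
emes = {"Invoice", "Customer", "Discount"}

scoped = scope_model(model, emes)
print(sorted(scoped.classes))            # → ['Customer', 'Invoice']
print(sorted(missing_emes(model, emes)))  # → ['Discount']
```

In this simplified reading, the scoped model is what inspectors see, while the list of EMEs (including those without a counterpart in the model) supports them during defect detection.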
For evaluation, we conducted a controlled experi-
ment with students using real industrial artifacts aim-
ing to understand how Model Scoping with EMEs
would influence the model inspection effectiveness
and efficiency. The experiment results indicate, with
statistical significance and large effect sizes, that
applying Model Scoping with EMEs before the inspection
improved both effectiveness and efficiency of the
inspectors when reviewing UML class diagrams against
the functional specification excerpts. Additionally,
qualitative data indicated that inspectors perceive
their inspection tasks as less complex when Model
Scoping with EMEs has been applied before inspection.
Our takeaway message is that we recommend applying
Model Scoping with EMEs before inspections in
situations where large UML class diagrams are to be
inspected against excerpts (or increments) of
functional specifications. Nevertheless, further
investigations are needed to estimate precisely in
which cases Model Scoping with EMEs would be (most)
worth the upfront investment. We call on the community
to replicate the reported experiment on Model Scoping
with EMEs, including its use with other diagrams in
other contexts, to reinforce the experimental evidence
and improve external validity.
ACKNOWLEDGEMENT
The financial support by the Christian Doppler Re-
search Association, the Austrian Federal Ministry for
Digital & Economic Affairs and the National Foun-
dation for Research, Technology and Development is
gratefully acknowledged.
REFERENCES
Briand, L., Falessi, D., Nejati, S., Sabetzadeh, M. and Yue,
T., 2014. Traceability and SysML design slices to sup-
port safety inspections: A controlled experiment. ACM
Transactions on Software Engineering and Methodol-
ogy (TOSEM), 23(1): 43p.
Davis, F.D., 1989. Perceived Usefulness, Perceived Ease of
Use, and User Acceptance of Information Technology.
MIS Quarterly, vol. 13.
Dybå, T., Kampenes, V. B., Sjøberg, D. I. K., 2006. A
systematic review of statistical power in Software
Engineering experiments. Information and Software
Technology, 48(8): 745-755.
Elberzhager, F., Münch, J., Nha, V.T.N., 2012. A system-
atic mapping study on the combination of static and dy-
namic quality assurance techniques. In: Information
and Software Technology, 54(1):1-15.
Fagan, M.E., 1976. Design and code inspections to reduce
errors in program development. IBM Systems Journal,
15(3): 182-211.
Falessi, D., Juristo, N., Wohlin, C., Turhan, B., Münch, J.,
Jedlitschka, A. and Oivo, M., 2018. Empirical software
engineering experts on the use of students and profes-
sionals in experiments. Empirical Software Engineer-
ing, 23(1): 452-489.
Laitenberger, O., DeBaud, J.M., 2000. An encompassing
life cycle centric survey of software inspection. In: J. of
Syst. and Software, 50(1):5-31.
Lange, C.F., Chaudron, M.R., 2005. Managing model qual-
ity in UML-based software development. IEEE Interna-
tional Workshop on Software Technology and Engi-
neering Practice, pages 7-16.
Larman, C., 2004. Applying UML and Patterns. 3rd Edition,
Prentice Hall.
Sabaliauskaite, G., Matsukawa, F., Kusumoto, S., Inoue, K.,
2002. An experimental comparison of checklist-based
reading and perspective-based reading for UML design
document inspection. Int. Symp. on Empirical Software
Engineering, pages 148–157.
Sabaliauskaite, G., Matsukawa, F., Kusumoto, S., Inoue, K.,
2003. Further investigations of reading techniques for
object-oriented design inspection. Information and
Software Technology, 45(9): 571–585.
Sabaliauskaite, G., Kusumoto, S., Inoue, K., 2004. Assessing
defect detection performance of interacting teams in
object-oriented design inspection. Information and
Software Technology, 46(13): 875–886.
Sabou, M., Winkler, D., Petrovic, S., 2018. Expert Sourcing
to Support the Identification of Model Elements in Sys-
tem Descriptions. In: SWQD 2018, pages 83-99.
Shull, F., 1998. Developing Techniques for Using Software
Documents: A Series of Empirical Studies. Ph.D. the-
sis, University of Maryland, College Park.
Solingen, R. van, Basili, V., Caldiera, G., Rombach, H. D.,
2002. Goal Question Metric (GQM) Approach. In:
Encyclopedia of Software Engineering.
Thelin, T., Runeson, P., Wohlin, C., 2003. An experimental
comparison of usage-based and checklist-based read-
ing. In: TSE, 29(8):687–704.
Theocharis, G., Kuhrmann, M., Münch, J., Diebold, P.,
2015. Is water-scrum-fall reality? On the use of agile
and traditional development practices. International
Conference on Product-Focused Software Process
Improvement, pages 149-166.
Travassos, G., Shull, F., Fredericks, M., Basili, V., 1999.
Detecting defects in object-oriented designs: Using
reading techniques to increase software quality. In:
Proc. of OOPSLA, pages 47-56.
Turner, M., Kitchenham, B., Brereton, P., 2010. Does the
technology acceptance model predict actual use? A sys-
tematic literature review. Information and Software
Technology, vol. 52: 463-479.
Winkler, D., Kalinowski, M., Sabou, M., Petrovic, S., Biffl,
S., 2018. Investigating a Distributed and Scalable
Model Review Process. CLEI Electronic Journal,
21(1): 4:1–4:13.
Winkler, D., Sabou, M., Petrovic, S., Carneiro, G.,
Kalinowski, M., Biffl, S., 2017a. Improving model
inspection with crowdsourcing. In: International
Workshop on Crowdsourcing in Software Engineering,
ICSE, Buenos Aires, Argentina, pages 30-34.
Winkler, D., Sabou, M., Petrovic, S., Carneiro, G.,
Kalinowski, M., Biffl, S., 2017b. Improving Model In-
spection Processes with Crowdsourcing: Findings from
a Controlled Experiment. In EuroSPI, pages 125–137.
Wohlin, C., Runeson, P., Höst, M., Ohlsson, M. C.,
Regnell, B., Wesslén, A., 2012. Experimentation in
Software Engineering. Berlin, Heidelberg: Springer
Berlin Heidelberg.