Glaserian Systematic Mapping Study: An Integrating Methodology

Gustavo Navas 1,2,a and Agustín Yagüe 1,b

1 Escuela Técnica Superior de Ingeniería de Sistemas Informáticos ETSISI, Universidad Politécnica de Madrid (UPM), Calle Alan Turing s/n, Ctra. de Valencia, Km. 7, Madrid, Spain

2 Universidad Politécnica Salesiana, Morán Valverde S/N y Rumichaca, Quito, Ecuador

Keywords:

Glaserian Grounded Theory, Systematic Mapping Study, Qualitative Analysis, Software Development.

Abstract:

This research responds to the limited classification capability that affects the vast majority of the articles selected in a Systematic Mapping Study (SMS) when studying Grounded Theory (GT) in Software Development. The result of our research is the Glaserian Systematic Mapping Study (GSMS), a methodology that combines SMS and Glaserian Grounded Theory (GGT), which is one of the two variants of GT. By combining the robustness and sequential process of SMS with the iterative features of GGT, GSMS provides a more robust, flexible, iterative, and scalable methodology. SMS and GGT share two main activities, data collection and data analysis, but they conduct them differently. The resulting integration takes advantage of this fact and maps both the related activities and their outcomes to produce a more robust and systematic methodology. In addition, our research formalizes equations to represent the data saturation typical of qualitative methods such as GGT. With GSMS, we were able to classify more articles than with SMS alone.

1 INTRODUCTION

When a Systematic Mapping Study (SMS) is applied, the results may not be sufficiently complete. Sometimes the results are not significant enough because it is not always possible to fully classify the source data during document analysis. This low classification ability could be the main reason why an SMS fails, since articles that cannot be classified are discarded. Conducting an SMS with a higher coverage percentage would imply a heavy preliminary phase of code definition and document classification. Classification is essential because the mapping results are not significant enough with a low volume of items.

We verified this when conducting an SMS to study the application of Grounded Theory (GT) in Software Development: the results obtained were not as promising as expected, and this motivated our work. In that initial work, we got results similar to those of previous literature reviews on GT in software engineering, where the classification rate did not cover more than 55% of the total number of papers; according to (Stol et al., 2016), this is due to an inadequate application of GT. Their arguments did not convince us because some of the discarded works had significant contributions, well-established theories on the subject they dealt with, and essential findings in software engineering and grounded theory. Our goal was to better understand the causes of this low classification rate. We then changed the focus to look for a way to increase the classification capabilities of SMS without heavy coding processes.

a https://orcid.org/0000-0002-2811-0282

b https://orcid.org/0000-0002-4761-0901

We propose an integration between SMS and GT to increase the coverage rate, providing a more robust classification mechanism that complements the rigour of systematic mapping studies with the flexibility of grounded theory through iterations. Our approach is based on Glaserian Grounded Theory (GGT), which starts data analysis without any preconceived notion. In our research, we started by analyzing 70 articles and, in the end, achieved a high classification rate and obtained theories about how to apply GT in software engineering.

The structure of this paper is as follows. Section 2 presents the background, and Section 3 describes the guidelines applied in the integration process. Section 4 develops the integration to produce what we call the Glaserian Systematic Mapping Study (GSMS), and Section 5 covers an application case of GSMS. Finally, some conclusions and future work are presented.

Navas, G. and Yagüe, A.

Glaserian Systematic Mapping Study: An Integrating Methodology.

DOI: 10.5220/0011090500003176

In Proceedings of the 17th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE 2022), pages 519-527

ISBN: 978-989-758-568-5; ISSN: 2184-4895


2 BACKGROUND

This section briefly describes SMS and GT. We use the word “step” when referring to SMS processes, “stage” when referring to GT, and “phase” in the proposed methodology.

2.1 Systematic Mapping Study (SMS)

Systematic mapping is mainly used in medicine, but its relevance in Software Engineering research is increasing (Petersen et al., 2015). The Systematic Mapping Study (SMS) is a rigorous review process of the scientific literature. SMS establishes a well-defined methodology that allows mapping scientific articles (Kitchenham et al., 2011) and reduces the bias of people's opinions. As proposed by Petersen et al., the SMS process has six steps: Step 1, Research questions definition (Figure 1(a) P1); Step 2, Conduct Search (Figure 1(a) P2); Step 3, Screening of papers (Figure 1(a) P3); Step 4, Keywording (Figure 1(a) P4); Step 5, Mapping (Figure 1(a) P5); and Step 6, Synthesis (Figure 1(a) P6). Moreover, we have included Rigour and relevance assessment (Figure 1(a) P7) as an additional SMS step, as proposed by Paternoster (Paternoster et al., 2014).

Figure 1(a) depicts SMS steps and their outcomes.

On one side, P1, P2, and P3 share the same goal: col-

lecting data and selecting the scientiﬁc articles to be

analyzed. On the other side, P4, P5, and P6 deal with

the analysis and mapping of results and ﬁnally, P7 is

an approach to evaluate the scientiﬁc quality.

2.2 Glaserian Grounded Theory

Grounded Theory is a qualitative research methodology proposed by Glaser and Strauss in 1965 and consolidated in 1972 (Glaser and Strauss, 1973). GT is a methodology that generates a substantive theory about the topics under research, their concepts, and categories through constant and systematic comparison of data during the process. GT has evolved into two variants: Glaserian Grounded Theory (GGT) (Glaser, 1992) and Straussian Grounded Theory (SGT) (Van Niekerk and Roode, 2009).

The main difference between GGT and SGT lies in the role of the researcher and the starting point. GGT emphasizes an open attitude where theories do not come from the researchers' preconceptions; instead, they come from the data. In SGT, the researcher applies a set of tools and procedures and has an active role, using existing insights and experience during the research (Strauss, A. and Corbin, 1990). These differences can be summarized as follows: GGT is independent of the researcher's ideas, while the researcher's views influence SGT. Within the GT variants, behavior is the way of facing a problem or concern in an area of study. GGT approaches behavior by generating concepts and relationships that explain and interpret its variation in an area of study. On the other hand, SGT describes the full range of behaviors (Sharma and Biswas, 2015).

Figure 1: GGT and SMS integration.

We selected GGT as the grounded theory approach because, as stated by Stray et al. (Stray et al., 2016), it makes it easier for research questions to arise during data analysis, allowing concepts and categories to emerge from the data with more flexibility.

According to Glaser, GGT is a research methodology with two stages (Glaser, 1992): Data collection and Data analysis. Later, Adolph (Adolph et al., 2008) enriched this methodology by including Comparison with the literature as a new stage. These stages are depicted as S1, S2, and S3 in Figure 1(b). The GGT Data analysis stage comprises three substages, Open Coding (Figure 1(b) S2 1), Selective Coding (Figure 1(b) S2 2), and Theoretical Coding (Figure 1(b) S2 3), and produces two outcomes: Core Category and Emerging Theory. Data analysis is a data modeling process to discover information, extract conclusions, and support hypotheses; it starts with the data gathered in the previous stage and ends when a theory appears (Parizi et al., 2014). GGT can therefore also be considered a data analysis tool, since it provides an iterative mechanism to build theories from data.

ENASE 2022 - 17th International Conference on Evaluation of Novel Approaches to Software Engineering


3 INTEGRATION DESIGN

The integration of GGT stages and SMS steps increases the flexibility of SMS and systematizes the stages of GGT. As stated by (Finfgeld-Connett, 2014), this is a complex task because there are no previous recommendations on including GT in systematic reviews. The integration, done by mapping steps to stages, has two objectives: i) to identify the relationship between each step and stage, and ii) to determine the expected outcomes of the integrated research methodology.

Both methodologies share two main activities, “Data collection” and “Data analysis”, but they conduct them differently. The proposed integration of SMS and GGT takes advantage of this fact and maps both the related activities and their outcomes to produce a more robust and systematic methodology. GGT, as the more open research method, has driven the integration process. The resulting method has the same structure as GGT, comprising three phases: Data collection, Data analysis, and Comparison with literature. The correspondence between SMS steps and GGT stages is shown in Figure 1. SMS steps are integrated into GGT stages so that they are applied systematically to produce and collect the data that are later analyzed in the “Data analysis” stage. Figure 1 shows the relationship between the SMS steps P1, P2, and P3 and the “GGT Data collection” stage. In the same way, steps P4, P5, and P6 are integrated into the “GGT Data analysis” stage and, finally, step P7 into the “GGT Comparison with literature” stage.
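The step-to-stage correspondence described in this paragraph can be written down as a small lookup table. The following is only an illustrative sketch: the names `STEP_TO_PHASE` and `steps_by_phase` are hypothetical, and the step labels follow the numbering used in this paragraph.

```python
from collections import defaultdict

# Hypothetical lookup table: SMS step -> phase of the integrated methodology.
# Labels follow the paragraph above (P1-P3, P4-P6, P7).
STEP_TO_PHASE = {
    "P1 Research questions definition": "Data collection",
    "P2 Conduct search": "Data collection",
    "P3 Screening of papers": "Data collection",
    "P4 Keywording": "Data analysis",
    "P5 Mapping": "Data analysis",
    "P6 Synthesis": "Data analysis",
    "P7 Rigour and relevance assessment": "Comparison with literature",
}


def steps_by_phase(step_to_phase):
    """Group the SMS steps under the phase they are integrated into."""
    grouped = defaultdict(list)
    for step, phase in step_to_phase.items():
        grouped[phase].append(step)
    return dict(grouped)


for phase, steps in steps_by_phase(STEP_TO_PHASE).items():
    print(phase, "<-", ", ".join(steps))
```

Keeping the mapping as data rather than as control flow makes it easy to check that every SMS step is assigned to exactly one phase.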

The integration starts by mapping the “Data collection” GGT stage to the SMS steps dealing with the acquisition of data sources. In a traditional SMS, P1 is the first step; in GGT, it can be mapped to the elaboration of questionnaires or interviews on the topic under research. In the integrated methodology, P1 corresponds to the definition of research questions because GT is very permissive (Zayour and Hamdar, 2016) and very diverse in data collection; moreover, some studies use documents to replace interviews and questionnaires (Adolph et al., 2008). Step P2 corresponds to Conduct Search. In the integrated methodology, it has the same meaning as in the traditional SMS. It is equivalent to the writing and debugging of the answers obtained when conducting interviews or administering questionnaires for analysis in conventional GGT. The outcome of this step is the list of articles found in the scientific libraries that comply with the search string. Step P3 is the screening of papers, which applies the inclusion and exclusion criteria to obtain the relevant papers to be analyzed. In GSMS, it is used as in the traditional SMS. From the GGT perspective, it can be compared with the transcription and classification of interviews and questionnaires to produce working artifacts. The outcome of this GSMS phase is the collected data that will be used as the input for the data analysis phase.

Data analysis is the procedure for conceptualizing and analyzing data, including identifying relationships. It is an integration process that increases the level of abstraction; it starts from the results of the previous phase and ends when the theory appears (Parizi et al., 2014). The integration between GGT and SMS in data analysis is feasible because both share the goal of modelling data to draw conclusions and formulate hypotheses. The integration of “Data analysis” is more complex than that of “Data collection” because GGT allows multiple iterations, while SMS is sequential. Our proposal unifies them in an iterative process that comprises the GGT stages but adapts their scope and level of abstraction to the SMS steps. The SMS processes Keywording, Mapping, and Synthesis, labelled P4, P5, and P6 in Figure 1, each comprise the Open, Selective, and Theoretical coding stages and produce the outcomes depicted as O4, O5, and O7 in the same figure.

Open, selective, and theoretical coding have different meanings depending on the SMS step, even though they apply the same concept. The outcome of P4 (Keywording) comprises several classification schemes (O4) that are used as inputs in P5 (Mapping) to produce relationships between essential elements in terms of concepts, categories, and propositions. These elements are classified and refined to obtain systematic maps (O5). The maps are the inputs to P7 (Synthesis), where they are combined to produce the Core Category and the Emerging Theory (O7); they could also answer some research questions or identify new ones.

One of the contributions of integrating GGT and SMS is the possibility of dealing with the new research questions that may arise as an outcome of this phase. The number of iterations is directly related to the existence of unresolved questions. Finally, the Rigor and relevance assessment SMS step, labelled P6 in Figure 1, is applied only at the end of each iteration and is integrated with the “Comparison with the literature” stage of GGT to produce a paper ranking, shown as outcome O6 in the figure.


4 GLASERIAN SYSTEMATIC MAPPING STUDY (GSMS)

In brief, the Glaserian Systematic Mapping Study (GSMS) is a new approach that unifies the rigor and mapping of SMS with the flexibility of GGT for building theories. GSMS introduces an iterative process in the Data Analysis phase that allows the construction of theories by promoting a deeper understanding of the results, generating conceptual propositions, and including new research questions when they are discovered. GSMS applies GGT in two ways: first, as a qualitative research methodology (Glaser, 1992) that encompasses the SMS process, and second, as a data analysis tool used by the SMS steps to build emerging theories (Van Niekerk and Roode, 2009), as shown in Figure 2.

Figure 2: Glaserian Systematic Mapping Study (GSMS).

4.1 GSMS Data Collection Phase

This section describes the GSMS data collection com-

prising three subsections: Research questions deﬁni-

tion, Conduct Search, and Screening of papers.

A. Research Question Definition Phase. In the GSMS Research question definition phase, the initial research question or questions are stated. In GSMS, the research question is recommended to be generic, formulated following the paradigm proposed by GGT, “What do we have here?”, which also shapes the outcome of this phase, the “Review Scope” on the topic under study. The research questions therefore drive the data analysis phase. This phase is illustrated in Figure 2, label P1, with its corresponding outcome Review Scope, label O1.

B. Conduct Search Phase. In GSMS, the “Conduct search” phase comprises several activities. The first is to select the initial sources from scientific libraries such as ISI WoS, IEEE, Scopus, or ACM. The second is to apply a rigorous and systematic procedure to define the search string (Petersen et al., 2015). The search string must be refined through a series of tests until it reliably produces the outcome O2, as shown in Figure 2, labels P2 and O2.

The definition of the search string is critical and should balance precision and generality. If the topic under research is quite general, the search string should be general but precise. If the search string is too detailed, some relevant publications could be discarded because of the inclusion/exclusion criteria. In our case, the research started with a complex search string like (“Grounded Theory” & “Software Development” & “requirement” & “design”). Because this string was too specific, our results contained very few relevant publications. Later, we used a simplified expression with only two terms (“Software” AND “Grounded Theory”).
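To compare candidate search strings during the test runs, it can help to generate them from a list of terms. The helper below is a minimal sketch and not part of the methodology: `build_search_string` is a hypothetical name, and each scientific library has its own query syntax, so the output would still need to be adapted to the target database.

```python
def build_search_string(terms, operator="AND"):
    """Join quoted terms with a boolean operator, e.g. ("A" AND "B")."""
    if not terms:
        raise ValueError("at least one term is required")
    quoted = [f'"{term}"' for term in terms]
    return "(" + f" {operator} ".join(quoted) + ")"


# The two search strings mentioned in the text.
specific = build_search_string(
    ["Grounded Theory", "Software Development", "requirement", "design"],
    operator="&",
)
simplified = build_search_string(["Software", "Grounded Theory"])

print(specific)    # ("Grounded Theory" & "Software Development" & "requirement" & "design")
print(simplified)  # ("Software" AND "Grounded Theory")
```

Fewer terms joined by a conjunction give a broader result set, which matches the move from the four-term string to the two-term one.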

C. Screening of Papers Phase. In the GSMS “Screening of papers” phase (Figure 2, label P3), a series of inclusion and exclusion criteria are applied to the scientific articles obtained in the previous phase. After this first filtering process, the resulting articles go through a snowballing process to identify additional ones (Kitchenham et al., 2011).

Within GSMS, this is the last phase of “Data collection”. As shown in Figure 2, label O3, this phase has two expected outcomes: i) the list of research questions and ii) the set of relevant papers that form the basis for the GSMS data analysis process.
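The screening step can be sketched as a filter that keeps a paper only if it meets every inclusion criterion and no exclusion criterion. The paper does not list its criteria, so the records and the two criteria below are purely hypothetical and serve only to exercise the function; snowballing is not modelled.

```python
def screen_papers(papers, inclusion_criteria, exclusion_criteria):
    """Keep papers that meet every inclusion criterion and no exclusion criterion."""
    selected = []
    for paper in papers:
        included = all(criterion(paper) for criterion in inclusion_criteria)
        excluded = any(criterion(paper) for criterion in exclusion_criteria)
        if included and not excluded:
            selected.append(paper)
    return selected


# Hypothetical records and criteria (not taken from the study).
papers = [
    {"title": "Grounded theory in agile teams", "peer_reviewed": True},
    {"title": "Grounded theory of code review", "peer_reviewed": False},
    {"title": "A survey of compilers", "peer_reviewed": True},
]


def mentions_grounded_theory(paper):
    return "grounded theory" in paper["title"].lower()


def not_peer_reviewed(paper):
    return not paper["peer_reviewed"]


relevant = screen_papers(papers, [mentions_grounded_theory], [not_peer_reviewed])
print([paper["title"] for paper in relevant])  # ['Grounded theory in agile teams']
```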

4.2 GSMS Data Analysis Phase

The GSMS data analysis phase incorporates a series of iterations that starts with the data obtained in the previous phase and ends when a certain saturation level is achieved. The saturation level is independent of the researcher's criteria and should be based only on the systematic review of the elements produced in this phase. To consider an iteration finished, the following conditions must be verified: i) the iteration has led to one or more new research questions; ii) the iteration has answered at least one research question; iii) the iteration has fully answered a specific research question. This means that an iteration can require more than one loop through the corresponding processes before being considered finished. The data analysis phase finishes when the saturation level is achieved at the end of an iteration, that is, when all research questions have been wholly and thoroughly answered. This condition is formalized in the subsection “D. Formulation of Saturation in GSMS” of Section 4.2.

Table 1: Outcomes of GSMS Data Analysis.

Coding        Keywording     Mapping        Synthesis
Open          Concepts       Concepts       Concepts
Selective     Categories     Categories     Categories
Theoretical   Propositions   Propositions   Propositions

The three GSMS phases, Keywording, Mapping, and Synthesis, incorporate open, selective, and theoretical coding activities, which are carried out in an integrated and coordinated way through iterations. The Rigor and relevance assessment phase is conducted at the end of all iterations. It is essential to highlight that open, selective, and theoretical coding give rise to different outcomes: concepts, categories, and propositions, respectively. Concepts are basic ideas emerging from data, i.e., words, keywords, codes, notes, or diagrams. Concepts are related to one another to create categories (Adolph et al., 2008). These categories could be lists, relationships, or any other abstract elements based on the previous concepts. Finally, propositions connect the concepts and categories, producing a discursive set of theoretical statements that relate them. They are validated through constant data comparison (Chun Tie et al., 2019). Table 1 presents the type of outcome expected from each activity in the GSMS data analysis.

A. GSMS Keywording Phase. The GSMS keywording phase aims to obtain a Classification Schema (Figure 2, label O4). A Classification Schema is a set of elements comprising concepts, categories, and propositions. Concepts are the result of applying abstraction during the Keywording Open Coding, which is driven by the identified research questions. While performing the open coding activity, word lists, memos, and codes are analyzed in depth to identify concepts arising from the active research questions (Crabtree et al., 2009). The next activity is Keywording Selective Coding, where categories emerge. Categories represent a higher level of abstraction built on top of concepts: deepening the analysis of the concepts gives rise to categories that encompass them. A category could also be a list of relevant concepts. Finally, Keywording Theoretical Coding refers to the highest level of abstraction, which defines propositions to support emerging theories. Propositions establish theoretical knowledge based on consolidated statements built on concepts and categories. In summary, the outcomes of this phase are classification schemas for concepts, categories, and propositions.
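One way to picture the three levels of abstraction is as a small data structure in which categories can only be built from existing concepts, and propositions only from existing categories. This is an illustrative sketch under that assumption; the class and method names are hypothetical, and the proposition text is a placeholder rather than a finding of the study.

```python
from dataclasses import dataclass, field


@dataclass
class ClassificationSchema:
    concepts: set = field(default_factory=set)        # outcome of open coding
    categories: dict = field(default_factory=dict)    # category name -> set of concepts
    propositions: list = field(default_factory=list)  # (statement, categories) pairs

    def open_coding(self, terms):
        """Register concepts identified in the data."""
        self.concepts.update(terms)

    def selective_coding(self, category, members):
        """Build a category on top of already identified concepts."""
        members = set(members)
        unknown = members - self.concepts
        if unknown:
            raise ValueError(f"concepts not found in open coding: {sorted(unknown)}")
        self.categories[category] = members

    def theoretical_coding(self, statement, categories):
        """Record a proposition that relates existing categories."""
        missing = [name for name in categories if name not in self.categories]
        if missing:
            raise ValueError(f"categories not found in selective coding: {missing}")
        self.propositions.append((statement, tuple(categories)))


schema = ClassificationSchema()
schema.open_coding(["Glaserian", "Straussian"])
schema.selective_coding("GT variants", ["Glaserian", "Straussian"])
schema.theoretical_coding("Placeholder statement about GT variants", ["GT variants"])
print(schema.categories)
```

The validation in `selective_coding` and `theoretical_coding` mirrors the idea that each coding activity builds on the outcome of the previous one.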

B. GSMS Mapping Phase. The GSMS mapping phase starts after the keywording phase with the goal of creating systematic maps. During the open coding activity, the concepts coming from keywording are mapped to one another, generating new concepts to support the mapping. During selective coding, associations among existing categories and relationships between concepts and categories are identified to map new categories. Later, during the theoretical coding activities, new propositions could emerge from the maps. In this phase, some questions coming from the previous iteration could be answered. The outcomes of this phase are mappings of concepts, categories, and propositions.

C. GSMS Synthesis Phase. In this GSMS phase, the concepts, categories, and propositions that have arisen from the previous phases are the basis for the Synthesis Open Coding, which produces more abstract and complete concepts. These concepts are deepened and analyzed to establish the final categories during Synthesis Selective Coding. Finally, with constant data comparison, the final propositions are obtained in the Synthesis Theoretical Coding. This phase is shown in Figure 2, label P7. The outcome of the GSMS synthesis can be one of these: i) a new research question emerging as part of the GSMS process; ii) an answer to a previously established research question; or iii) a fundamental proposition that will give rise to an emergent theory on a topic in a later iteration.

Keywording and/or mapping results are incorporated into the open coding activity of the synthesis. Their integration is achieved through selective coding and then reaches the final proposition, that is, theoretical coding within the synthesis.

D. Formulation of Saturation in GSMS. Saturation has been modeled using two types of sets: Q is the set containing all the research questions generated, and AQ represents the set of answers to a specific RQ. The GSMS Data analysis process ends when the following conditions are met at the same time: i) there are no new research questions in the iteration, ii) all research questions have been answered, and iii) no new elements have been added in the current iteration to any of the sets.

These conditions can be formalized as follows:

Equation 1. There are no new research questions in the iteration. This can be restated as: the next iteration does not generate any new research questions. It is formulated as:


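\[
\left.
\begin{array}{ll}
\text{Given iteration } k: & Q_k = \Big( \sum_{i=1}^{n} RQ_i \Big) \\[4pt]
\text{Given iteration } k+1: & Q_{k+1} = \Big( \sum_{j=1}^{m} RQ_j \Big)
\end{array}
\right\}
\Rightarrow \text{No more questions when: } Q_{k+1} - Q_k = \emptyset
\tag{1}
\]

Let $Q_k$ be the set of research questions at iteration $k$ and $Q_{k+1}$ the set at iteration $k+1$, with $n$ and $m$ denoting the number of research questions in each set.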

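Equation 2. There is at least one answer to every research question in the set $Q_k$. It is formulated as:

\[
\left.
\begin{array}{l}
\text{Let } Q = \Big( \sum_{i=1}^{n} RQ_i \Big) \text{ where } RQ_i \text{ is a Research Question} \\[4pt]
\forall RQ_i \in Q \;\; \exists \text{ an iteration } k \text{ where it is true that} \\[4pt]
\exists A_k(i) = \Big( \sum_{j=1}^{m} AQ_i(j) \Big) \text{ where } AQ_i(j) \text{ is an answer to } RQ_i
\end{array}
\right\}
\tag{2}
\]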

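Equation 3. Saturation is reached when there are no more answers to any question in the set $Q_k$. This condition is formulated as:

\[
\left.
\begin{array}{l}
\text{Let } Q = \sum_{i=1}^{n} RQ_i \text{, considering } \forall RQ_i \in Q \\[4pt]
\text{Given the iterations } k \text{ and } k+1 \text{, it is true that} \\[4pt]
\exists A_k(i) = \sum_{j=1}^{m} AQ_i(j) \;\;\&\;\; A_{k+1}(i) = \sum_{j=1}^{m'} AQ_i(j) \\[4pt]
A_k(i) \text{ is the set of answers to } RQ_i \text{ in iteration } k \\[4pt]
A_{k+1}(i) \text{ is the set of answers to } RQ_i \text{ in iteration } k+1 \\[4pt]
\text{It is true that } A_k(i) = A_{k+1}(i)
\end{array}
\right\}
\tag{3}
\]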
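The three saturation conditions can be checked mechanically once the question and answer sets of two consecutive iterations are available. The sketch below is one possible reading of Equations 1 to 3, not the authors' implementation: the function name and the choice of Python sets and dictionaries are assumptions made for illustration.

```python
def is_saturated(q_prev, q_curr, a_prev, a_curr):
    """Check the three saturation conditions between two consecutive iterations.

    q_prev, q_curr: sets of research question ids at iterations k and k+1.
    a_prev, a_curr: dicts mapping a question id to its set of answers.
    """
    # Condition i (Equation 1): the new iteration adds no research questions.
    no_new_questions = not (q_curr - q_prev)
    # Condition ii (Equation 2): every research question has at least one answer.
    all_answered = all(a_curr.get(rq) for rq in q_curr)
    # Condition iii (Equation 3): no answer set changed between iterations.
    no_new_answers = all(
        a_curr.get(rq, set()) == a_prev.get(rq, set()) for rq in q_curr
    )
    return no_new_questions and all_answered and no_new_answers


# Situation after iteration 0 of the case study: two new questions, no answers yet.
q_data_collection = {"RQ1", "RQ2", "RQ3"}
q_iteration_0 = {"RQ1", "RQ2", "RQ3", "RQ4", "RQ5"}
no_answers = {rq: set() for rq in q_iteration_0}
print(is_saturated(q_data_collection, q_iteration_0, {}, no_answers))  # False
```

All three conditions must hold at the same time, so the function returns `False` as soon as a new question appears, a question is unanswered, or an answer set is still changing.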

Data Analysis Phase Iteration and Loops. GSMS data analysis is the most complex process in the methodology. It comprises iterations with well-stated goals. Iterations are the mechanism to reach the appropriate saturation level in the research. Each iteration receives inputs and produces outputs, and the output of one iteration is the input of the next, except for the first one, whose input is the output of “Data collection”. To achieve the expected goals, an iteration could require one or more loops. In GSMS, we use the term loop to refer to the complete execution of the phases Keywording, Mapping, and Synthesis. Once a loop is finished, it is evaluated whether the goal of the iteration has been achieved. If not, a new loop starts; if so, the process moves to the next phase. Figure 3 (b) shows an example of an iteration with one loop. We used activity diagrams to model iterations because their fork/join constructs support the branches that could happen during the execution of an iteration.
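The relationship between iterations and loops can be sketched as a simple control flow: a loop runs Keywording, Mapping, and Synthesis in order, and the iteration repeats loops until its goal is met. The code below is a hedged illustration with hypothetical names; in particular, the `max_loops` bound is an assumption added here as a safety limit and is not part of GSMS.

```python
def make_loop(keywording, mapping, synthesis):
    """Compose the three phases into one loop: Keywording -> Mapping -> Synthesis."""
    def run_loop(state):
        return synthesis(mapping(keywording(state)))
    return run_loop


def run_iteration(inputs, run_loop, goal_reached, max_loops=10):
    """Repeat loops until the goal of the iteration is achieved.

    Returns the final state and the number of loops used.
    """
    state = inputs
    for loop_number in range(1, max_loops + 1):
        state = run_loop(state)
        if goal_reached(state):
            return state, loop_number
    raise RuntimeError("iteration goal not reached within max_loops")


# Toy phases: each one only records that it ran.
def phase(name):
    return lambda state: state + [name]


loop = make_loop(phase("keywording"), phase("mapping"), phase("synthesis"))
final_state, loops_used = run_iteration([], loop, lambda state: len(state) >= 6)
print(loops_used)  # 2
```

The output of `run_iteration` would be the input of the next iteration, which is how the chaining described in the paragraph above could be modelled.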

The constant comparison of data and theoretical sampling are present in each of the loops within the iterations; without them, it would not have been possible to delve into the process of finding theories through propositions. The proposed GSMS allows questions that start from a general scope to evolve dynamically into specific, abstract, and challenging ones.

4.3 Comparison with Literature Phase

The set of answers to the questions arising in the process is the input for this phase, which includes the scientific rigour and industrial relevance assessment phase. It is conducted at the end of each iteration in which one or more answers to research questions were provided. There are some relevant considerations about this phase. First, it allows reviewing tertiary articles, thus establishing a difference with the primary articles obtained in data collection; it seeks to confirm or refute the findings of the iteration through other works on the subject of study. Second, the answered questions must provide a set of articles to compare with the emerging theories. Third, GGT establishes that there should be no pre-established validation criteria at the beginning of the research; however, the end of each iteration is the time to answer a question and state the findings, which are then compared and validated to look for similarities and differences.

A. Rigor and Relevance Assessment Phase. To elaborate the rubrics for scientific rigor and industrial relevance, we followed the recommendations given by (Ivarsson and Gorschek, 2010). Rigor and relevance were assessed for the answers to each research question through the coding activities.

5 CASE STUDY

This section provides an example of the application of

GSMS to study the use of grounded theory in software

development. It comprises two sections; the ﬁrst ex-

plains how the data collection was conducted, and the

second presents how the data analysis was performed.

5.1 Data Collection

Data collection was applied as described in Section 4.1. The search string used was (“Software” AND “Grounded Theory”). After the filtering process, 70 research papers were selected. Figure 3 (a) depicts an activity diagram of the process applied. This research is based on a previous systematic mapping conducted by the authors to study where and how Grounded Theory was applied in Software Development through the following research questions: RQ1: Where is Grounded Theory (GT) appropriate within the study of Software Development? RQ2: Is GT applied correctly in the processes and tasks of Software Development? RQ3: Is GT useful for Software Engineers in the industry?

5.2 Data Analysis Iteration Example

This section describes the first iteration of the GSMS data analysis in our research. Each iteration is presented with the following structure: i) the inputs received from the previous iteration or phase, as appropriate; ii) the application of the SMS steps and GGT stages; iii) the corresponding outputs for the next iteration or step. The input of the first iteration is the output of “Data collection” and comprises three research questions (RQ1, RQ2, RQ3) and 70 papers to be analyzed. Recall that open, selective, and theoretical coding give rise to concepts, categories, and propositions, respectively. Figure 3 (b) shows the activity diagram of this iteration.

Figure 3: (a) Data collection (b) Data analysis Iteration 0. [Activity diagrams. Panel (a), Data Collection: P1 Research Question Definition (O1-1 Research Questions RQ1, RQ2, RQ3; O1-2 Pre-Classification schema), P2 Conduct Search, P3 Screening the papers (O3-1 70 papers selected). Panel (b), Iteration 0: P4 Keywording (Pre-classification Schema), P5 Mapping (Software Development and Grounded Theory lists mapping), P7 Synthesis; a new research question is generated, so RQ4 and RQ5 are added and the process continues to Iteration 1.]

Within GSMS, keywording has become a constant comparison of data and abstractions through open, selective, and theoretical coding. Keywording tries to find a classification schema, but this schema must arise from previously established knowledge in the literature. For the initial list of topics in open coding, we started from the two concepts used in the conduct search phase: “software development” and “grounded theory”. These concepts were used because both are consolidated in the scientific literature. The open coding of keywording resulted in two lists: one of Software Development terms obtained from SWEBOK v3.0, the body of knowledge for Software Development (Bourque and Fairley, 2014), and the other corresponding to the variants of Grounded Theory (Urquhart, 2001). The application of selective coding produced categories. These categories were the lists that emerged from Software Development and Grounded Theory. In the case of Software Development, the list contains the 10 main SD processes obtained from SWEBOK v3.0. The Grounded Theory list has only two elements representing the GT variants: Glaserian (or Classical) Grounded Theory and Straussian (or Evolved) Grounded Theory.

Finally, theoretical coding seeks to establish propositions. These propositions, which we call the pre-classification schema, are more abstract and deeper levels of the categories obtained in selective coding; they encompass several concepts and are a more specific, refined proposition and an evolution in Software Development and Grounded Theory. Unfortunately, in iteration 0, these lists by themselves cannot generate relevant categories and propositions.

Mapping starts with its open coding and is based

on the elements of the Pre-classiﬁcation schema, as-

signing a code to each item in both lists. Later, the se-

lected articles were cataloged within the codes of the

pre-classiﬁcation schema. Selective coding in map-

ping is interested in summarizing the number of arti-

cles that can be cataloged within the two lists of the

Pre-classiﬁcation schema.

Finally, we were able to code 41 articles (58.57%) in the scope of software development and 34 articles (48.57%) in grounded theory. Theoretical coding requires establishing propositions based on the results previously obtained, so we analyzed the mapping results on the classification of the articles. It was determined that 20 articles (28.6%) had codes from both lists, 18 articles (25.7%) were not on either list, 19 articles (27.1%) were coded only with terms from the Software Development list, and 13 articles (18.6%) were coded only with terms from the grounded theory list.
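The shares of the four groups can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the four groups are disjoint and together make up the whole corpus; the variable names are illustrative and do not come from the study.

```python
# Disjoint groups reported for iteration 0 (counts taken from the text).
groups = {
    "both lists": 20,
    "neither list": 18,
    "software development only": 19,
    "grounded theory only": 13,
}

# Assumption: the groups are disjoint, so their sum is the corpus size.
total = sum(groups.values())

# Share of each group as a percentage, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in groups.items()}

for name, share in shares.items():
    print(f"{name}: {groups[name]} articles ({share}%)")
print(f"total: {total} articles")
```

Under that assumption the four counts add up to 70 articles, and each rounded share matches the percentage quoted in the paragraph.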


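The outcomes obtained in iteration 0 are:

$$Q = \Big\{ \sum_{i=1} Q_i \Big\}, \quad \text{where } Q_i \text{ is the set of RQs at iteration } i;$$

$$Q_0 = [RQ_1, \ldots, RQ_3], \quad \text{where } Q_0 \text{ is the set of RQs from data collection};$$

$$Q_1 = Q_0 + [RQ_4, RQ_5] = [RQ_1, \ldots, RQ_5];$$

$$A_0(1) = \emptyset;\; A_0(2) = \emptyset;\; A_0(3) = \emptyset;\; A_0(4) = \emptyset;\; A_0(5) = \emptyset; \tag{4}$$

where $A_0$ is the set of AQ of iteration 0 and $A_0(k)$ is the set of answers to $RQ_k$.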

Synthesis and Outcomes: We did not obtain sufficiently representative results when classifying the documents. Even though the two topics are consolidated in the specialized literature, this first classification rate was too low. The synthesis highlighted the lack of a proper classification of the selected works, because most of our articles could not be classified. This led us to propose two new research questions: RQ4: Is there a way to categorize the documents within GT in SD to increase the rate of cataloged papers? RQ5: Is there any other way to categorize software than the ones provided by the pre-classification schema?

None of the saturation conditions is met; therefore, another iteration is needed.
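The bookkeeping at the end of iteration 0 can be sketched in code. This is a minimal illustration, assuming the initial questions are RQ1 to RQ3 (inferred from the numbering of the new questions) and using a stand-in stopping rule; `next_iteration_needed` is a hypothetical helper and does not reproduce the saturation equations of the methodology.

```python
def next_iteration_needed(questions, answers, new_questions):
    """Stand-in stopping rule (an assumption, NOT the paper's saturation equations).

    Another iteration is needed while new research questions emerged
    or while any tracked question still has an empty answer set.
    """
    tracked = list(questions) + list(new_questions)
    unanswered = [rq for rq in tracked if not answers.get(rq)]
    return bool(new_questions) or bool(unanswered)


# Assumed initial research questions (labels only; the wording is in the paper).
initial_questions = ["RQ1", "RQ2", "RQ3"]

# The two research questions proposed by the synthesis of iteration 0.
new_questions = ["RQ4", "RQ5"]

# Question set carried into the next iteration.
all_questions = initial_questions + new_questions

# After iteration 0 every answer set is still empty.
answers = {rq: set() for rq in all_questions}

must_iterate = next_iteration_needed(initial_questions, answers, new_questions)
print(all_questions, must_iterate)
```

With two new questions and five empty answer sets, this stand-in rule also asks for another iteration, which agrees with the decision taken after iteration 0.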

Comparison with Literature: The existing literature reviews on the variants of GT in software engineering (Kroeger et al., 2014; Stol et al., 2016) also had a low classification rate, probably because the topic has not yet been studied in sufficient depth.

6 CONCLUSION

This paper represents a step forward in applying

systematic mapping analysis by enriching it with

grounded theory practices. The result is what we call

Glaserian Systematic Mapping Study. It combines the

rigour of systematic mapping and the ﬂexibility of

grounded theory. GSMS is more powerful than SMS

because coding activities are conducted in each itera-

tion, allowing new knowledge to emerge.

This publication contributes in the following aspects: i) It formalizes the GT processes; we have not found previous attempts to do so. ii) This formalization creates a new way of defining saturation through three equations in terms of research questions, their answers, and the concepts comprising the answers. iii) Research questions can be deepened as the iterations progress, thus achieving answers to deeper and more specific questions. iv) The answers that emerge from the iterations can be compared with other findings in the literature, allowing these findings to be validated. v) Findings can be validated in each iteration according to their application through the rigor and relevance assessment stage.

GSMS incorporates the scalability of SMS and the systematization of GGT. In GSMS, the SMS steps have been enriched with the data analysis tools provided by GGT, giving more depth to the results, especially in the steps of Keywording, Mapping, and Synthesis. GSMS can also evaluate the scientific rigor and industrial relevance of the results across iterations.

GSMS improved the classification rates compared to SMS. It also has the advantage that new research questions can be added as they arise, without restarting the research process. In our case, applying SMS alone classified only 55.7% of the articles, whereas with GSMS the classification rate exceeded 80%.

REFERENCES

Adolph, S., Hall, W., and Kruchten, P. (2008). A Method-

ological Leg to Stand on: Lessons Learned Using

Grounded Theory to Study Software Development.

CASCON ’08, pages 13:166–178, NY, USA. ACM.

Bourque, P. and Fairley, R. E. (2014). Guide to the Software

Engineering Body of Knowledge (SWEBOK(R)): Ver-

sion 3.0. IEEE CS Press, CA, USA, 3rd edition.

Chun Tie, Y., Birks, M., and Francis, K. (2019). Grounded theory research: A design framework for novice researchers. SAGE Open Medicine, 7.

Crabtree, C. A., Seaman, C. B., and Norcio, A. F. (2009).

Exploring language in software process elicitation: A

grounded theory approach. In ESEM 2009.

Finfgeld-Connett, D. (2014). Use of content analysis to conduct knowledge-building and theory-generating qualitative systematic reviews. Qualitative Research, 14(3):341–352.

Glaser, B. G. (1992). Basics of grounded theory analysis.

Mill Valley, Calif. : Sociology Press.

Glaser, B. G. and Strauss, A. L. (1973). The Discovery

of Grounded Theory: Strategies for Qualitative Re-

search. Aldine.

Ivarsson, M. and Gorschek, T. (2010). A method for

evaluating rigor and industrial relevance of technol-

ogy evaluations. Empirical Software Engineering,

16(3):365–395.

Kitchenham, B. A., Budgen, D., and Pearl Brereton, O.

(2011). Using mapping studies as the basis for further

research - A participant-observer case study. Informa-

tion and Software Technology, 53(6):638–651.

Kroeger, T. A., Davidson, N. J., and Cook, S. C. (2014). Understanding the characteristics of quality for software engineering processes: A Grounded Theory investigation. Information and Software Technology, 56(2):252–271.

Parizi, R. M., Gandomani, T. J., and Nafchi, M. Z. (2014).

Hidden facilitators of agile transition: Agile coaches

and agile champions. In 2014 8th Malaysian Software

Engineering Conf., MySEC 2014, pages 246–250.

ENASE 2022 - 17th International Conference on Evaluation of Novel Approaches to Software Engineering


Paternoster, N., Giardino, C., Unterkalmsteiner, M., Gorschek, T., and Abrahamsson, P. (2014). Software development in startup companies: A systematic mapping study. Information and Software Technology, 56(10):1200–1218.

Petersen, K., Vakkalanka, S., and Kuzniarz, L. (2015). Guidelines for conducting systematic mapping studies in software engineering: An update. Information and Software Technology, 64:1–18.

Sharma, R. and Biswas, K. K. (2015). Functional Require-

ments Categorization Grounded Theory Approach. In

ENASE 2015, pages 301–307.

Stol, K.-J., Ralph, P., and Fitzgerald, B. (2016). Grounded

theory in software engineering research: A Critical

Review and Guidelines. In ICSE ’16, pages 120–131.

Strauss, A. and Corbin, J. (1990). Basics of Qualitative Re-

search: Grounded Theory Procedures and Techniques.

Newbury Park, CA: Sage.

Stray, V., Sjøberg, D. I., and Dybå, T. (2016). The daily stand-up meeting: A grounded theory study. Journal of Systems and Software, 114:101–124.

Urquhart, C. (2001). An encounter with grounded theory:

Tackling the practical and philosophical issues. In

Qualitative research in IS: Issues and trends, pages

104–140. IGI Global.

Van Niekerk, J. C. and Roode, J. (2009). Glaserian and Straussian Grounded Theory: Similar or Completely Different? In SAICSIT’09, number 10, pages 96–103.

Zayour, I. and Hamdar, A. (2016). A qualitative study on

debugging under an enterprise IDE. Information and

Software Technology, 70:130–139.
